Posted to dev@subversion.apache.org by Philip Martin <ph...@codematters.co.uk> on 2006/01/25 17:21:15 UTC

Re: svn commit: r18204 - trunk/contrib/hook-scripts

martinto@tigris.org writes:

> Author: martinto
> Date: Tue Jan 24 10:16:30 2006
> New Revision: 18204
>
> Modified:
>    trunk/contrib/hook-scripts/README
>    trunk/contrib/hook-scripts/case-insensitive.py
>
> Log:
> * contrib/hook-scripts/case-insensitive.py:
>   Output the 'clash' text as utf-8, previously ascii was used and
>   non-ascii characters in files names caused the error message at the
>   subversion client to read 'svn: General svn error from server'.
>   Non ascii names such as Àbc.txt and àbc.txt are now displayed at
>   the client when they clash.

> --- trunk/contrib/hook-scripts/case-insensitive.py	(original)
> +++ trunk/contrib/hook-scripts/case-insensitive.py	Tue Jan 24 10:16:30 2006
> @@ -85,11 +85,12 @@
>        clashes[canonical_path][join_path(dir, name_pair[1])] = True
>  
>  if (clashes):
> +  utfeight='utf-8'
>    for canonical_path in clashes.iterkeys():
> -    sys.stderr.write(u'Clash:'.encode(native))
> +    sys.stderr.write(u'Clash:'.encode(utfeight))
>      for path in clashes[canonical_path].iterkeys():
> -      sys.stderr.write(u' \''.encode(native) +
> -                       str(path).decode('utf-8').encode(native, 'replace') +
> -                       u'\''.encode(native))
> -    sys.stderr.write(u'\n'.encode(native))
> +      sys.stderr.write(u' \''.encode(utfeight) +
> +                       str(path).decode('utf-8').encode(utfeight, 'replace') +
> +                       u'\''.encode(utfeight))
> +    sys.stderr.write(u'\n'.encode(utfeight))
>    sys.exit(1)

Forcing the hook script to output UTF-8 is not correct; hook scripts
are supposed to output in the native encoding[1].  The first thing that
libsvn_repos does with the hook output is convert it from the native
encoding to UTF-8, so it makes no sense for the hook to output UTF-8
unless the native encoding is also UTF-8.  I believe the original code
was correct.

[1] httpd might be an exception since it doesn't set up the locales,
    in that case hook scripts probably need to restrict their output
    to plain ascii.
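
As an illustration of the native-encoding approach described above, here
is a minimal sketch.  It is written for Python 3, not the Python 2 of
case-insensitive.py, and format_clash is a hypothetical helper, not part
of the script.  It assumes the hook inherits the server's locale.

```python
import locale


def format_clash(paths, encoding=None):
    """Return one 'Clash:' line as bytes in the native encoding.

    format_clash is a hypothetical helper for illustration only.
    """
    if encoding is None:
        # Assumption: the hook process inherits the server's locale.
        encoding = locale.getpreferredencoding(False)
    text = "Clash:" + "".join(" '%s'" % p for p in paths) + "\n"
    # 'replace' writes '?' for characters the encoding cannot represent.
    return text.encode(encoding, "replace")
```

A hook could pass the result to sys.stderr.buffer.write().  Passing the
encoding explicitly, e.g. "iso-8859-1", shows what a given locale would
produce.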

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: svn commit: r18204 - trunk/contrib/hook-scripts

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> $ LANG=en_GB svn mkdir http://localhost:8888/obj/repo/XXX`printf "\xe3"`

Oops!  That command doesn't match the error, this command does:

LANG=en_GB svn mkdir http://localhost:8888/obj/repo/xxx`printf "\xc3"`

> ../svn/subversion/libsvn_ra_dav/util.c:827: (apr_err=165001)
> svn: MERGE request failed on '/obj/repo'
> ../svn/subversion/libsvn_ra_dav/util.c:389: (apr_err=165001)
> svn: 'pre-commit' hook failed with error output:
> Clash: '/xxx?' '/xxx?'

-- 
Philip Martin


Re: svn commit: r18204 - trunk/contrib/hook-scripts

Posted by Philip Martin <ph...@codematters.co.uk>.
Martin Tomes <li...@tomes.org> writes:

> Philip Martin wrote:
>> martinto@tigris.org writes:
>>> Author: martinto
>>> Date: Tue Jan 24 10:16:30 2006
>>> New Revision: 18204
>>> Log:
>>> * contrib/hook-scripts/case-insensitive.py:
>>>   Output the 'clash' text as utf-8, previously ascii was used and
>>>   non-ascii characters in files names caused the error message at the
>>>   subversion client to read 'svn: General svn error from server'.
>>>   Non ascii names such as Àbc.txt and àbc.txt are now displayed at
>>>   the client when they clash.
>>
>> Forcing the hook script to output UTF-8 is not correct, hook scripts
>> are supposed to output in the native encoding[1].  The first thing that
>> libsvn_repos does with the hook output is convert from the native
>> encoding to UTF-8, it makes no sense for the hook to output UTF-8
>> unless the native encoding is also UTF-8.  I believe the original code
>> was correct.
>> [1] httpd might be an exception since it doesn't set up the locales,
>>     in that case hook scripts probably need to restrict their output
>>     to plain ascii.
>
> Would it be possible to detect whether the hook script is being fired
> off from apache or svnserve?

Not easily.  I suppose you could do something platform-specific and look
at parent process IDs.
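
A rough sketch of what such a platform-specific check might look like, in
Python 3.  It assumes a Linux-style /proc filesystem, and
parent_process_name is a hypothetical helper.  The server is not
necessarily the hook's direct parent (a shell wrapper may sit in
between), so this is best-effort only.

```python
import os


def parent_process_name(pid=None):
    """Best-effort name of a process (default: our parent), or None.

    Assumption: Linux-style /proc; elsewhere this returns None.
    """
    if pid is None:
        pid = os.getppid()
    try:
        with open("/proc/%d/comm" % pid) as f:
            return f.read().strip()
    except OSError:
        # No /proc, no such process, or no permission to read it.
        return None
```

Which names to compare the result against depends on how the server
binaries are named on the installation in question.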

> We have companies in France and Italy
> using Subversion so this hook really should detect non-ascii case
> conflicts.

It does detect such conflicts.  Using the r18203 version of the
script, i.e. without your change:

$ LANG=en_GB svnserve -dr.

$ LANG=en_GB svn mkdir svn://localhost/repo/xxx`printf "\xe3"`
Committed revision 1.

$ LANG=en_GB svn mkdir svn://localhost/repo/xxx`printf "\xc3"`
../svn/subversion/libsvn_repos/hooks.c:131: (apr_err=165001)
svn: 'pre-commit' hook failed with error output:
Clash: '/xxxÃ' '/xxxã'

../svn/subversion/svn/util.c:417: (apr_err=165001)
svn: Your commit message was left in a temporary file:
../svn/subversion/svn/util.c:417: (apr_err=165001)
svn:    'svn-commit.tmp'

I've shown the client and server using the same locale, but that's not
necessary.  What is necessary is that the server and the hook must use
the same locale.  [Note: I'm only dealing with message encoding here,
I'm ignoring the whole question of whether using a locale's
'convert-to-lowercase' is sufficient to detect name clashes and
whether client and server need to use the same locale for it to work.]
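
To see why the server and the hook must agree, here is a small Python 3
model of the conversion.  It is not the libsvn_repos code; server_view is
a hypothetical stand-in for "decode the hook output from the server's
native encoding".

```python
def server_view(hook_bytes, server_encoding):
    """Decode hook output as a server assuming server_encoding would."""
    return hook_bytes.decode(server_encoding, "replace")


name = "/xxx\u00c3"  # the path '/xxxÃ'

# Hook and server both use ISO-8859-1: the name survives the round trip.
same = server_view(name.encode("iso-8859-1"), "iso-8859-1")

# Hook forced to UTF-8 while the server assumes ISO-8859-1: the two
# UTF-8 bytes of 'Ã' are decoded as two separate characters.
mixed = server_view(name.encode("utf-8"), "iso-8859-1")
```

Here same equals name, while mixed contains an extra character after the
'Ã', which is the kind of damage forced UTF-8 output causes when the
native encoding is something else.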

What I've shown above won't work with ra_dav because the httpd process
doesn't make a setlocale call and so httpd itself cannot do the
native/UTF-8 conversions for the non-ascii characters in the hook
output.  To work with httpd the script needs to force the output to
ascii using something like (again using the r18203 version):

Index: contrib/hook-scripts/case-insensitive.py
===================================================================
--- contrib/hook-scripts/case-insensitive.py	(revision 18203)
+++ contrib/hook-scripts/case-insensitive.py	(working copy)
@@ -85,6 +85,7 @@
       clashes[canonical_path][join_path(dir, name_pair[1])] = True
 
 if (clashes):
+  native = 'ascii' # for httpd
   for canonical_path in clashes.iterkeys():
     sys.stderr.write(u'Clash:'.encode(native))
     for path in clashes[canonical_path].iterkeys():


$ LANG=en_GB svn mkdir http://localhost:8888/obj/repo/XXX`printf "\xe3"`
../svn/subversion/libsvn_ra_dav/util.c:827: (apr_err=165001)
svn: MERGE request failed on '/obj/repo'
../svn/subversion/libsvn_ra_dav/util.c:389: (apr_err=165001)
svn: 'pre-commit' hook failed with error output:
Clash: '/xxx?' '/xxx?'

../svn/subversion/svn/util.c:417: (apr_err=165001)
svn: Your commit message was left in a temporary file:
../svn/subversion/svn/util.c:417: (apr_err=165001)
svn:    'svn-commit.2.tmp'


Forcing ascii output has the disadvantage that any non-ascii
characters show up as question marks, but there is no alternative if
using httpd.
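
The question marks come from the 'replace' error handler, as this short
Python 3 sketch shows.  The variable names are illustrative only.

```python
paths = ["/xxx\u00c3", "/xxx\u00e3"]  # '/xxxÃ' and '/xxxã'

# Encoding to ascii with 'replace' turns each non-ascii character into '?'.
ascii_paths = [p.encode("ascii", "replace").decode("ascii") for p in paths]

line = "Clash:" + "".join(" '%s'" % p for p in ascii_paths)
```

Both paths collapse to '/xxx?', so the message still reports the clash
but no longer shows how the two names differ.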

-- 
Philip Martin



Re: svn commit: r18204 - trunk/contrib/hook-scripts

Posted by Martin Tomes <li...@tomes.org>.
Philip Martin wrote:
> martinto@tigris.org writes:
>> Author: martinto
>> Date: Tue Jan 24 10:16:30 2006
>> New Revision: 18204
>> Log:
>> * contrib/hook-scripts/case-insensitive.py:
>>   Output the 'clash' text as utf-8, previously ascii was used and
>>   non-ascii characters in files names caused the error message at the
>>   subversion client to read 'svn: General svn error from server'.
>>   Non ascii names such as Àbc.txt and àbc.txt are now displayed at
>>   the client when they clash.
>
> Forcing the hook script to output UTF-8 is not correct, hook scripts
> are supposed to output in the native encoding[1].  The first thing that
> libsvn_repos does with the hook output is convert from the native
> encoding to UTF-8, it makes no sense for the hook to output UTF-8
> unless the native encoding is also UTF-8.  I believe the original code
> was correct.
> 
> [1] httpd might be an exception since it doesn't set up the locales,
>     in that case hook scripts probably need to restrict their output
>     to plain ascii.

Would it be possible to detect whether the hook script is being fired 
off from apache or svnserve?  We have companies in France and Italy 
using Subversion so this hook really should detect non-ascii case conflicts.

-- 
Martin Tomes
echo 'martin at tomes x org x uk'\
  | sed -e 's/ x /\./g' -e 's/ at /@/'

Visit http://www.subversionary.org/
