You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Greg Ames <am...@gmail.com> on 2010/05/19 00:14:00 UTC

[PATCH - v3] readable error messages on non-ASCII systems

This version hits error.c only, no new APIs.  Sending as an attachment per
http://subversion.apache.org/docs/community-guide/general.html#patches

[[[ Produce readable error messages on non-ASCII systems.

* subversion/libsvn_subr/error.c:
      print_error():  convert err->message and the message prefix to UTF-8
before calling svn_cmdline_printf.
      svn_error_wrap_apr(): remove UTF-8 conversion of apr_strerror output
before storing it in err->message to avoid a double conversion
]]]

Greg

Re: [PATCH - v3] readable error messages on non-ASCII systems

Posted by Stefan Sperling <st...@elego.de>.
On Fri, May 21, 2010 at 01:23:13AM +0200, Stefan Sperling wrote:
> (I don't really see a need for svn to have its own ctype implementation),

The need seems to be that we cannot rely on every locale correctly
handling UTF-8 strings.
But it seems APR would be a better home for this kind of thing.

Stefan

[PATCH - v3] readable error messages on non-ASCII systems

Posted by Greg Ames <am...@gmail.com>.
forgot to cc:  dev again.

---------- Forwarded message ----------
From: Greg Ames <am...@gmail.com>
Date: Fri, May 21, 2010 at 10:28 AM
Subject: Re: [PATCH - v3] readable error messages on non-ASCII systems
To: Stefan Sperling <st...@elego.de>



On Thu, May 20, 2010 at 7:23 PM, Stefan Sperling <st...@elego.de> wrote:

> On Thu, May 20, 2010 at 03:10:04PM -0400, Greg Ames wrote:
> > I see that the svn ctype functions are hardwired to ASCII.  If they
> called
> > the equivalent apr ctype functions instead, they should become portable
> to
> > EBCDIC.  Doing that should change the behavior of
> > svn_utf__cstring_from_utf8_fuzzy() and produce readable normal messages
> as
> > well as the error messages.
>
> Have you tried tweaking svn_ctype.h to use the ctype functions from APR?
> Does that fix your problem?
>

I tried, but no luck the first time, so I just quickly re-tweaked it to use
the native OS ctypes.  That worked great!  Most of the messages are readable
now.  It's a rather convoluted code path but I can't knock success.

Looking back at APR's include/apr_lib.h, I see the problem:

/** @see isascii */
#ifdef isascii
#define apr_isascii(c) (isascii(((unsigned char)(c))))
#else
#define apr_isascii(c) (((c) & ~0x7f)==0)
#endif

I don't see anything that #defines isascii, so I fell into the #else which
is ASCII bit pattern dependent too.  I'll just tweak the #ifdef and it
should work fine for me.  The other APR ctypes are fine.
.

> If it helps we could either discard the custom ctype implementation
> (I don't really see a need for svn to have its own ctype implementation),
> or at least use the apr versions #ifdef'd for your platform.
>

I would think that APR should be its home eventually.  I'll submit an
#ifdef'd patch after I take care of / test APR.

>
> Have you looked at the old ebcdic branch?
> http://svn.apache.org/viewvc/subversion/branches/ebcdic/?r=876004
> You might want to try to get a diff of the work done on that branch
> relative to its origin, to see what people did back then to make
> Subversion deal with ebcdic.
>

 Yes I did look at it.  That should be a big help.

But although cheap branches are a very cool feature, I don't think huge code
drops in branches, patches, or otherwise are a good way to get code changes
into trunk in an open source project.   IMO you get more eyballs and better
reviews if you can decompose it into small patches.

Greg

Re: [PATCH - v3] readable error messages on non-ASCII systems

Posted by Stefan Sperling <st...@elego.de>.
On Thu, May 20, 2010 at 03:10:04PM -0400, Greg Ames wrote:
> I see that the svn ctype functions are hardwired to ASCII.  If they called
> the equivalent apr ctype functions instead, they should become portable to
> EBCDIC.  Doing that should change the behavior of
> svn_utf__cstring_from_utf8_fuzzy() and produce readable normal messages as
> well as the error messages.

Have you tried tweaking svn_ctype.h to use the ctype functions from APR?
Does that fix your problem?

If it helps we could either discard the custom ctype implementation
(I don't really see a need for svn to have its own ctype implementation),
or at least use the apr versions #ifdef'd for your platform.

Have you looked at the old ebcdic branch?
http://svn.apache.org/viewvc/subversion/branches/ebcdic/?r=876004
You might want to try to get a diff of the work done on that branch
relative to its origin, to see what people did back then to make
Subversion deal with ebcdic.

(Note that all I know about EBCDIC is what I read on wikipedia just
a minute ago -- but it sounds like you're in for a lot of fun...)

Stefan

Re: [PATCH - v3] readable error messages on non-ASCII systems

Posted by Greg Ames <am...@gmail.com>.
This isn't going to do the whole job, nor is patch #2.  If I grep for
'svn_cmdline_printf.*\"' to find where native literal strings are passed to
svn_cmdline_printf, I get 130 hits in trunk, for example:

./subversion/svn/util.c:    SVN_ERR(svn_cmdline_printf(pool, _("\nWarning:
%s\n"),
./subversion/svn/proplist-cmd.c:    SVN_ERR(svn_cmdline_printf(pool,
_("Properties on '%s':\n"), name_local));

So it's not just error messages that are a problem on z/OS.  Digging deeper,
I see that the svn ctype functions are hardwired to ASCII.  If they called
the equivalent apr ctype functions instead, they should become portable to
EBCDIC.  Doing that should change the behavior of
svn_utf__cstring_from_utf8_fuzzy() and produce readable normal messages as
well as the error messages.

Greg

On Tue, May 18, 2010 at 6:14 PM, Greg Ames <am...@gmail.com> wrote:

> This version hits error.c only, no new APIs.  Sending as an attachment per
> http://subversion.apache.org/docs/community-guide/general.html#patches
>
> [[[ Produce readable error messages on non-ASCII systems.
>
> * subversion/libsvn_subr/error.c:
>       print_error():  convert err->message and the message prefix to UTF-8
> before calling svn_cmdline_printf.
>       svn_error_wrap_apr(): remove UTF-8 conversion of apr_strerror output
> before storing it in err->message to avoid a double conversion
> ]]]
>
> Greg
>
>
>