Posted to dev@subversion.apache.org by Kobayashi Noritada <no...@dolphin.c.u-tokyo.ac.jp> on 2004/08/12 09:43:17 UTC

[l10n] Bug?: Unreadable characters displayed (in some messages)

Hello developers,

While using 1.1.0 RC2, I found a problem with message display when the
locale is 'ja_JP.eucJP' or 'ja_JP.ujis':
some messages are displayed with unreadable characters.
At least the following 4 messages turn into unintelligible gibberish:

(when running 'svn log' without 'svn update' after the first commit...)
  #: clients/cmdline/log-cmd.c:201
  #, c-format
  msgid "No commit for revision 0.%s"

(when dumping with 'svnadmin dump' or loading with 'svnadmin load'...)
  #: libsvn_repos/dump.c:1069
  #, c-format
  msgid "* %s revision %ld.\n"

  #: libsvn_repos/load.c:801
  #, c-format
  msgid "<<< Started new transaction, based on original revision %ld\n"

  #: libsvn_repos/load.c:1122
  #, c-format
  msgid ""
  "\n"
  "------- Committed revision %ld >>>\n"
  "\n"

So far, other help/info/error messages (e.g. the messages shown when
committing locally, or the output of 'svn log', 'svn info' or
'svnlook history') are displayed correctly in ja_JP.eucJP.
But I suspect that other messages may also turn into gibberish.

Since such unreadable messages may also appear when using 1.1.0 RC2 in
other language environments,
could you check this in your own language?
And does anyone know the cause of this problem?

-- 
|:  Noritada KOBAYASHI
|:  Dept. of General Systems Studies,
|:  Graduate School of Arts and Sciences, Univ. of Tokyo
|:  E-mail: nori1@dolphin.c.u-tokyo.ac.jp (preferable)
|:          nori@esa.c.u-tokyo.ac.jp

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [l10n] Bug?: Unreadable characters displayed (in some messages)

Posted by Kobayashi Noritada <no...@dolphin.c.u-tokyo.ac.jp>.
Hi,

> Yea. On it. Filed as #1997. I'd like it to block on 1.1 as far as the 'svn
> log' problem goes. It's easy to fix.

Thanks for filing the issue, and for r10652.

I found another case and reported it as a comment on #1997.

  3) 'svn diff' which uses apr_psprintf statements directly on the result of
  _() calls.

Involved messages:
(in the output from 'svn diff')
  #: libsvn_client/diff.c:267
  #, c-format
  msgid "%s\t(revision %ld)"

  #: libsvn_client/diff.c:270
  #, c-format
  msgid "%s\t(working copy)"

Thanks,

-- 
|:  Noritada KOBAYASHI
|:  Dept. of General Systems Studies,
|:  Graduate School of Arts and Sciences, Univ. of Tokyo
|:  E-mail: nori1@dolphin.c.u-tokyo.ac.jp (preferable)
|:          nori@esa.c.u-tokyo.ac.jp


Re: [l10n] Bug?: Unreadable characters displayed (in some messages)

Posted by Erik Huelsmann <e....@gmx.net>.
> On Thu, 12 Aug 2004, Kobayashi Noritada wrote:
> 
> > When using 1.1.0 RC2,
> > I found a problem about displaying messages with the locale being
> > 'ja_JP.eucJP' or 'ja_JP.ujis'.
> > The problem is that some messages are displayed with unreadable characters.
> > At least, following 4 messages turn into unintelligible gibberish:
> >
> > (in running 'svn log' without 'svn update' after the first commit...)
> >   #: clients/cmdline/log-cmd.c:201
> >   #, c-format
> >   msgid "No commit for revision 0.%s"
> >
> > (in dumping with 'svnadmin dump' or loading with 'svnadmin load'...)
> >   #: libsvn_repos/dump.c:1069
> >   #, c-format
> >   msgid "* %s revision %ld.\n"
> >
> >   #: libsvn_repos/load.c:801
> >   #, c-format
> >   msgid "<<< Started new transaction, based on original revision %ld\n"
> >
> >   #: libsvn_repos/load.c:1122
> >   #, c-format
> >   msgid ""
> >   "\n"
> >   "------- Committed revision %ld >>>\n"
> >   "\n"
> >
> This is because we use svn_stream_printf for a message to the stdout, but
> that function doesn't convert from UTF-8 to the native encoding. We can
> fix this in two ways:
> - Manually convert from UTF-8 in these places.
> - Create a stream that converts from UTF-8 and acts like a filter.
>

Yea. On it. Filed as #1997. I'd like it to block on 1.1 as far as the 'svn
log' problem goes. It's easy to fix.

For the svnadmin problem, we indeed need to create a stream which uses
svn_cmdline_*printf to pass as the 'feedback stream'.
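
The layering described above can be sketched roughly as follows. This is a minimal Python illustration, not Subversion's C code: `dump_revisions`, `make_feedback_stream` and the Japanese sample string are hypothetical, and `euc_jp` is assumed to stand in for the native encoding of a ja_JP.eucJP locale. The library routine writes UTF-8 progress messages to whatever feedback stream it is handed, and the command-line front end supplies a stream that transcodes them.

```python
import codecs
import io


def dump_revisions(revisions, feedback):
    """Hypothetical library routine: reports progress as UTF-8 bytes."""
    for rev in revisions:
        # Made-up sample translation, not taken from ja.po.
        message = "* リビジョン %d をダンプしました。\n" % rev
        feedback.write(message.encode("utf-8"))


def make_feedback_stream(raw, native_encoding="euc_jp"):
    """Wrap a binary stream so that UTF-8 bytes written to the wrapper
    reach `raw` transcoded to the native encoding."""
    return codecs.EncodedFile(raw, data_encoding="utf-8",
                              file_encoding=native_encoding)


if __name__ == "__main__":
    sink = io.BytesIO()  # stands in for the terminal
    dump_revisions([1, 2], make_feedback_stream(sink))
    # The sink now holds EUC-JP bytes rather than raw UTF-8.
    print(sink.getvalue() != "* リビジョン 1".encode("utf-8"))
```

The library code stays unaware of the locale; only the caller that owns the terminal decides which encoding to write.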

I have started working on the 'svn log' problem while I try to figure out
where to create a stream that is basically meant to be used as a
replacement for the svn_cmdline_* functions...

> I'm not sure which is best. The first is minimal, I believe, so it might
> be most appropriate for back-porting to 1.1.
> 


bye,

Erik.


Re: [l10n] Bug?: Unreadable characters displayed (in some messages)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Thu, 12 Aug 2004, Kobayashi Noritada wrote:

> When using 1.1.0 RC2,
> I found a problem about displaying messages with the locale being
> 'ja_JP.eucJP' or 'ja_JP.ujis'.
> The problem is that some messages are displayed with unreadable characters.
> At least, following 4 messages turn into unintelligible gibberish:
>
> (in running 'svn log' without 'svn update' after the first commit...)
>   #: clients/cmdline/log-cmd.c:201
>   #, c-format
>   msgid "No commit for revision 0.%s"
>
> (in dumping with 'svnadmin dump' or loading with 'svnadmin load'...)
>   #: libsvn_repos/dump.c:1069
>   #, c-format
>   msgid "* %s revision %ld.\n"
>
>   #: libsvn_repos/load.c:801
>   #, c-format
>   msgid "<<< Started new transaction, based on original revision %ld\n"
>
>   #: libsvn_repos/load.c:1122
>   #, c-format
>   msgid ""
>   "\n"
>   "------- Committed revision %ld >>>\n"
>   "\n"
>
This is because we use svn_stream_printf for a message to the stdout, but
that function doesn't convert from UTF-8 to the native encoding. We can
fix this in two ways:
- Manually convert from UTF-8 in these places.
- Create a stream that converts from UTF-8 and acts like a filter.

I'm not sure which is best. The first is minimal, I believe, so it might
be most appropriate for back-porting to 1.1.
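
The two options can be sketched as below. This is only a Python illustration under stated assumptions, not the Subversion implementation: `utf8_to_native` and `Utf8ToNativeFilter` are hypothetical names, and `euc_jp` is assumed as the native encoding.

```python
import codecs
import io

NATIVE_ENCODING = "euc_jp"  # assumption: the encoding of a ja_JP.eucJP locale


def utf8_to_native(utf8_bytes, encoding=NATIVE_ENCODING):
    """Option 1: convert a single UTF-8 message by hand before writing it."""
    text = utf8_bytes.decode("utf-8")
    # Characters the native encoding cannot represent are replaced.
    return text.encode(encoding, errors="replace")


class Utf8ToNativeFilter:
    """Option 2: a write-only filter that converts everything passing through."""

    def __init__(self, raw, encoding=NATIVE_ENCODING):
        self._raw = raw
        self._encoding = encoding
        # Incremental decoding tolerates a multi-byte character that is
        # split across two write() calls.
        self._decoder = codecs.getincrementaldecoder("utf-8")()

    def write(self, utf8_bytes):
        text = self._decoder.decode(utf8_bytes)
        self._raw.write(text.encode(self._encoding, errors="replace"))
        return len(utf8_bytes)
```

Option 1 touches only the call sites that are known to be broken, which is why it is the smaller change; option 2 fixes every message written through the stream, including ones nobody has reported yet.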

Thanks, Kobayashi, for reporting and pointing at the messages.

//Peter


Re: [l10n] Bug?: Unreadable characters displayed (in some messages)

Posted by Kobayashi Noritada <no...@dolphin.c.u-tokyo.ac.jp>.
Hi Erik,

From: "Erik Huelsmann" <e....@gmx.net>
Subject: Re: [l10n] Bug?: Unreadable characters displayed (in some messages)
Date: Thu, 12 Aug 2004 12:27:12 +0200 (MEST)

> > When using 1.1.0 RC2,
> > I found a problem about displaying messages with the locale being
> > 'ja_JP.eucJP' or 'ja_JP.ujis'.
> > The problem is that some messages are displayed with unreadable
> > characters.
> > 
> 
> > So far other help/info/error messages (e.g. messages in committing locally,
> > results of running 'svn log', 'svn info' or 'svnlook history') are displayed
> > correctly in ja_JP.eucJP.
> > But I think other messages possibly turn into unintelligible gibberish.
> 
> Could you show us some output? What does the gibberish look like? strings of
> \x999 characters?

Sorry for the lack of detail.
I now know what the unreadable characters are:
they are UTF-8 characters.

Since I am using a non-UTF-8 language environment (EUC-JP), most messages are
converted when displayed.
But the 4 messages described above are displayed without character encoding
conversion, so they look like gibberish.

For each output message, I compared (A) the actual output, (B) the expected
output in EUC-JP, and (C) the UTF-8 output converted from (B).
I found:
* (A) == (B) != (C) for most messages
* (A) == (C) != (B) for the 4 messages described above
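
The comparison can be expressed as a small check. This is a hedged Python sketch: `classify_output` is a hypothetical helper, and `euc_jp` is assumed as the native encoding.

```python
def classify_output(actual, expected_text, native_encoding="euc_jp"):
    """Compare captured output bytes (A) with the expected text encoded
    natively (B) and as UTF-8 (C)."""
    expected_native = expected_text.encode(native_encoding)  # (B)
    expected_utf8 = expected_text.encode("utf-8")            # (C)
    # For pure-ASCII text (B) and (C) are identical, so such messages
    # are reported as "converted".
    if actual == expected_native:
        return "converted"   # (A) == (B)
    if actual == expected_utf8:
        return "raw UTF-8"   # (A) == (C): the conversion step was skipped
    return "unknown"
```

A message classified as "raw UTF-8" went to the terminal without passing through the conversion step.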

> Do the translated or untranslated messages cause these problems? 

Many translated messages are displayed correctly.
It seems that whether a message is translated or not does not matter.

> Can you give us more info (as much as you can think of) describing your
> system, svn version, message translation, etc?

The environment is as follows:
* OS: Debian GNU/Linux 3.0r2 (woody)
* terminal: kterm or mlterm
* $LANG: ja_JP.eucJP
* svn: compiled from the 1.1.0 RC2 tarball without modification (except for
  './configure --prefix=$HOME')
* message translation: ja.po included in the 1.1.0 RC2 tarball (the same as
  /trunk/subversion/po/ja.po and /branches/1.1.x/subversion/po/ja.po)

Any other information required?

I'm sorry if this is caused by libraries in my environment;
so far I cannot tell the possible causes apart.
Also, although I now understand what the unreadable characters are,
I do not understand why only some of the messages turn into gibberish.

Regards,

-- 
|:  Noritada KOBAYASHI
|:  Dept. of General Systems Studies,
|:  Graduate School of Arts and Sciences, Univ. of Tokyo
|:  E-mail: nori1@dolphin.c.u-tokyo.ac.jp (preferable)
|:          nori@esa.c.u-tokyo.ac.jp


Re: [l10n] Bug?: Unreadable characters displayed (in some messages)

Posted by Erik Huelsmann <e....@gmx.net>.
> Hello developers,
> 
> When using 1.1.0 RC2,
> I found a problem about displaying messages with the locale being
> 'ja_JP.eucJP' or 'ja_JP.ujis'.
> The problem is that some messages are displayed with unreadable
> characters.
> At least, following 4 messages turn into unintelligible gibberish:
> 
> (in running 'svn log' without 'svn update' after the first commit...)
>   #: clients/cmdline/log-cmd.c:201
>   #, c-format
>   msgid "No commit for revision 0.%s"
> 
> (in dumping with 'svnadmin dump' or loading with 'svnadmin load'...)
>   #: libsvn_repos/dump.c:1069
>   #, c-format
>   msgid "* %s revision %ld.\n"
> 
>   #: libsvn_repos/load.c:801
>   #, c-format
>   msgid "<<< Started new transaction, based on original revision %ld\n"
> 
>   #: libsvn_repos/load.c:1122
>   #, c-format
>   msgid ""
>   "\n"
>   "------- Committed revision %ld >>>\n"
>   "\n"

> So far other help/info/error messages (e.g. messages in committing locally,
> results of running 'svn log', 'svn info' or 'svnlook history') are displayed
> correctly in ja_JP.eucJP.
> But I think other messages possibly turn into unintelligible gibberish.

Could you show us some output? What does the gibberish look like? strings of
\x999 characters?

Do the translated or untranslated messages cause these problems? 

> Since such unreadable messages may be displayed when using 1.1.0 RC2 
> with other language environments,
> will you check this in your own language?
> And does anyone have information about the cause of this problem?

I use a Latin-1-based language with a UTF-8 character set, which does not
show any problems. I don't think that is really a test case, since no
conversion to a local encoding is required.

Can you give us more info (as much as you can think of) describing your
system, svn version, message translation, etc?


bye,


Erik.
