You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Mattias Engdegård <ma...@bredband.net> on 2013/05/02 22:45:00 UTC

Make option parsing error messages translatable

It is quite annoying that some of the more common error messages, the

svn: invalid option: --gurgle

kind, are entirely untranslatable because they are printed directly by  
APR which does not have any concept of translation at all. (This was  
reported before but met with silence: http://svn.haxx.se/dev/archive-2008-05/1445.shtml)

It is not easy to fix it in a clean way; ideally, APR would return a  
detailed error code together with the required string parameters, but  
it doesn't. Even if we did change APR to that effect, it would be an  
incompatibility requiring a new APR version for use with Subversion.

This patch, instead, installs an apr_getopt_err_fn_t, acting as a  
replacement for fprintf. It is admittedly a hack, but quite a  
localised one -- it doesn't foul up any other code -- and does not  
have any negative effects that I can think of. In short, it should be  
a strict improvement.

Should APR's getopt error strings change, all that will happen is that  
they are shown untranslated, just like any other translated strings.


Re: Make option parsing error messages translatable

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Wed, May 08, 2013 at 15:34:58 +0200:
> On Wed, May 08, 2013 at 03:57:22PM +0300, Daniel Shahaf wrote:
> > I don't really have much to say to the rest of your mail that I didn't
> > already say.  It boils down to a disagreement over whether
> >     lambda x: !strcmp(x, "invalid option")
> > is a stable API.
> 
> I think this problem should be fixed in APR. APR should provide some
> official way of translating error messages.
> 
> APR is a dependency we are using and this dependency does not support
> localization. Let's recognize that as the real problem and lobby to
> get it fixed, rather than working around it.

While discussing Mattias' recent emails with him on IRC, I came up with
the following observations: tools/dev/aprerr.txt contains over 90 error
codes, for all of which svn_strerror() will call apr_strerror() and
return an English error message.

Re: Make option parsing error messages translatable

Posted by Mattias Engdegård <ma...@bredband.net>.
8 maj 2013 kl. 15.34 skrev Stefan Sperling:

> I think this problem should be fixed in APR. APR should provide some
> official way of translating error messages.

I agree. While it would be easy to do technically, getting such a  
change accepted into APR, a new APR release to be made, and a new  
Subversion release that requires that new APR version would likely  
take quite some time. Hence my stopgap solution, which works now, is  
robust, and can be done in 1.8.x.


Re: Make option parsing error messages translatable

Posted by Stefan Sperling <st...@elego.de>.
On Wed, May 08, 2013 at 03:57:22PM +0300, Daniel Shahaf wrote:
> I don't really have much to say to the rest of your mail that I didn't
> already say.  It boils down to a disagreement over whether
>     lambda x: !strcmp(x, "invalid option")
> is a stable API.

I think this problem should be fixed in APR. APR should provide some
official way of translating error messages.

APR is a dependency we are using and this dependency does not support
localization. Let's recognize that as the real problem and lobby to
get it fixed, rather than working around it.

Re: Make option parsing error messages translatable

Posted by Mattias Engdegård <ma...@bredband.net>.
8 maj 2013 kl. 14.57 skrev Daniel Shahaf:

> Apples and oranges.  The issue you're talking about is that _("hello
> world") would have to be translated differently in two places within  
> our
> code.  Your patch potentially uses _() either on string literals from
> APR's future code (which, unlike string literals in our code, we can't
> review pre-release for appropriateness of the translation to both of
> them) or even on user input.

When was the last time any such review was done? I doubt any has ever  
been performed.
But that doesn't actually matter. When a translator sees

#: somefile.c:123
msgid "some string"

in the .po file, his responsibility is to translate that string in  
general and as used in that particular spot in the source in  
particular. (Most of the time he will just write the obvious  
translation right away without so much as looking at the source. Only  
if the message is difficult to understand in isolation will the source  
be consulted at all.)

In this case, what he will find in the Subversion source are the five  
strings

"%s: invalid option: %s\n"
"%s: missing argument: %s\n"
"%s: erroneous argument: %s\n"
"%s: invalid option character: %c\n"
"%s: missing argument: %c\n"

in the command-line option parsing error handler. Those strings are  
quite trivial to translate as they are, but a translator bothering to  
look at the source would find that it confirms his intuition: these  
are option parsing error messages, and can have only one meaning.

In other words, it does not matter how or why APR generates them, or  
if this would change at any point in the future. The translation has  
been done based on the strings' occurrence in the Subversion source,  
not in APR.

I will gladly add a comment to translators looking that that part of  
the source that they should not try to take the exact origin of those  
strings in APR into account when translating, in the unlikely event  
that anyone would ever contemplate such an endeavour.


Re: Make option parsing error messages translatable

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Tue, May 07, 2013 at 21:44:00 +0200:
> 7 maj 2013 kl. 01.00 skrev Daniel Shahaf:
>
>> Of course there is, if someday APR uses "invalid option" in another
>> meaning than the one this patch assumes it does.
>
> Believe it or not, that is already how gettext translation works. If you 
> think that really is a problem, then Subversion has got it in spades. 
> Every time we mark up a string for translation with _(), there is the 
> danger that someone will add an identical string in another meaning 
> somewhere else, and nothing will alert us to the fact.
>

Apples and oranges.  The issue you're talking about is that _("hello
world") would have to be translated differently in two places within our
code.  Your patch potentially uses _() either on string literals from
APR's future code (which, unlike string literals in our code, we can't
review pre-release for appropriateness of the translation to both of
them) or even on user input.

I don't really have much to say to the rest of your mail that I didn't
already say.  It boils down to a disagreement over whether
    lambda x: !strcmp(x, "invalid option")
is a stable API.

Re: Make option parsing error messages translatable

Posted by Mattias Engdegård <ma...@bredband.net>.
7 maj 2013 kl. 01.00 skrev Daniel Shahaf:

> Of course there is, if someday APR uses "invalid option" in another
> meaning than the one this patch assumes it does.

Believe it or not, that is already how gettext translation works. If  
you think that really is a problem, then Subversion has got it in  
spades. Every time we mark up a string for translation with _(), there  
is the danger that someone will add an identical string in another  
meaning somewhere else, and nothing will alert us to the fact.

I admit feeling uneasy about this myself the first time I saw gettext,  
but have come to understand that it is quite robust in practice --  
witness the thousands of software packages successfully doing  
translation that way, Subversion included.

The main "danger", which I have mentioned before, is marking up  
strings that are too short or ambiguous in nature, in particular  
single words like "merge" or "yes". Such strings should generally be  
avoided or, if they cannot be, use gettext contexts (which are not  
used anywhere in Subversion yet). However, something as verbose as  
"%s: invalid option: %s\n" should be quite safe, and is not atypical  
of strings in our message catalogue.

(In addition, the very fact that these strings are sent to the  
function that should print out error messages during option parsing  
serves as a tremendously strong disambiguating force.)

> That's why API promises exist: so anticipating won't be needed.  The  
> API
> promises something and you work to that.  The API doesn't promise you
> something and you don't presume it'll happen.

We use the API to good effect here, as well as common sense. Regarding  
the semantics, I think everyone will agree that if APR tells the error  
printer "invalid option" during option parsing, it won't diverge much  
if at all in meaning for the next couple of centuries, no matter how  
wildly APR changes.

Note that the strings passed to the errfn never affect the logic of  
the program in any way, only their translations. This is also an  
interface of sorts, but a softer, human interface, the interface of  
translations, which carries its own rules and logic.

> Well yeah, it could grow an enum.
>
> Or it could just promise that "invalid option" in the 2nd replaceable
> argument will always be okay to translate.  That doesn't require any
> code changes, just a decision by the APR devs not to introduce other
> meanings to that particular string in the future.

Either would do, but neither is really necessary. In practice, the APR  
getopt code is as inert as they come - it basically hasn't been  
touched in over a decade. But sure, we could work on APR, and this  
patch would work nicely in the meantime.

> I've missed the previous report of that problem.  (Is that why Jens is
> on the CC?  I haven't seen him in a while.)

It was mentioned in the first message of this thread:
http://svn.haxx.se/dev/archive-2008-05/1445.shtml
(Jens, sorry to bother you, but I didn't anticipate such a lengthy  
discussion, and included you since you showed an interest in the past.  
If you want some peace and quiet, please don't hesitate to tell us off.)


Re: Make option parsing error messages translatable

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Tue, May 07, 2013 at 00:45:44 +0200:
> 6 maj 2013 kl. 23.39 skrev Daniel Shahaf:
> 
> >The patch achieves its end in an unmaintainable way.  It's liable to
> >break when APR's implementation details change, and since it
> >ultimately
> >attempts to parse something which wasn't intended to be parsed, it
> >might
> >do the _wrong_ thing (as opposed to just not doing anything) someday.
> 
> No, APR is constrained by its API; it cannot change arbitrarily.
> 

Aware of that, but "%s: %s: %c\n" is not an API promise, it's an
implementation detail.

> >I can't see today what other circumstances APR would call errfn->("%s:
> >%s: %c\n", _, "invalid option", _) in, but they don't guarantee us
> >that
> >there won't be other circumstances so I tend not to assume that they
> >won't.  That's how I treat Subversion APIs that I write and call, and
> >that's how I treat APR too: I rely on promises, not on implementation
> >details.
> 
> No, the patch will translate "%s: invalid option: %s\n" and a few
> other messages as specified by the .po files. There is no way it can
> "do the wrong thing".
> 

Of course there is, if someday APR uses "invalid option" in another
meaning than the one this patch assumes it does.

I already said, I don't see exactly how that can happen since errfn has
very few and specific usages --- but I don't want to assume no one will
ever find another creative use for errfn that your patch does not
anticipate.

That's why API promises exist: so anticipating won't be needed.  The API
promises something and you work to that.  The API doesn't promise you
something and you don't presume it'll happen.

> The worst that can happen is that someone changes the spelling or
> formatting of one of the messages in APR. The consequence is then
> that this message will be displayed untranslated, and that the
> translators may not immediately get to know about it since the
> changed strings are not seen by the gettext string extractor.
> Clearly, this is no worse than the present situation.
> 
> Yes, this patch relies on promises made by the APR - the errfn
> signature and semantics.
> 

No, not the signature, but the specific arguments and their meaning.

> Should APR eventually contain a more sophisticated error-reporting
> interface in the future, the proposed code could of course be
> retired in favour of a use of that interface. Until then, the patch
> would serve its purpose.
> 

Well yeah, it could grow an enum.

Or it could just promise that "invalid option" in the 2nd replaceable
argument will always be okay to translate.  That doesn't require any
code changes, just a decision by the APR devs not to introduce other
meanings to that particular string in the future.

> >>Since the patch:
> >>1) solves an annoying problem,
> >
> >Which no one complained about.
> 
> Two translators have pointed out the defect. Either should be
> sufficient.
> 

I've missed the previous report of that problem.  (Is that why Jens is
on the CC?  I haven't seen him in a while.)

> >Keep in mind that error message in the libraries are translated, and
> >'svn help' is translated.  Is it really that important to translate
> >'invalid option'?  If it is, wouldn't someone have complained about it
> >on users@?
> 
> Surely the worthiness of bugs to be fixed isn't measured in the
> number of complaints to the users@ mailing list?

The facts are that two translators complained and no user complained.
It's not a question of "5 users v. 5000 users".

> Of course it's important to translate that message -- if
> localisation is important at all, that is.
> 
> >Feel free to point me at another place in svn's code that ignores API
> >non-promises or parses something that wasn't intended to be parsed.
> 
> The suggested patch does not misuse the API. Should APR change,
> crashes or corruption cannot occur; the API guarantees that.

That's correct, it won't crash.  It might print a different message than
APR intended.

Re: Make option parsing error messages translatable

Posted by Mattias Engdegård <ma...@bredband.net>.
6 maj 2013 kl. 23.39 skrev Daniel Shahaf:

> The patch achieves its end in an unmaintainable way.  It's liable to
> break when APR's implementation details change, and since it  
> ultimately
> attempts to parse something which wasn't intended to be parsed, it  
> might
> do the _wrong_ thing (as opposed to just not doing anything) someday.

No, APR is constrained by its API; it cannot change arbitrarily.

> I can't see today what other circumstances APR would call errfn->("%s:
> %s: %c\n", _, "invalid option", _) in, but they don't guarantee us  
> that
> there won't be other circumstances so I tend not to assume that they
> won't.  That's how I treat Subversion APIs that I write and call, and
> that's how I treat APR too: I rely on promises, not on implementation
> details.

No, the patch will translate "%s: invalid option: %s\n" and a few  
other messages as specified by the .po files. There is no way it can  
"do the wrong thing".

The worst that can happen is that someone changes the spelling or  
formatting of one of the messages in APR. The consequence is then that  
this message will be displayed untranslated, and that the translators  
may not immediately get to know about it since the changed strings are  
not seen by the gettext string extractor. Clearly, this is no worse  
than the present situation.

Yes, this patch relies on promises made by the APR - the errfn  
signature and semantics.

Should APR eventually contain a more sophisticated error-reporting  
interface in the future, the proposed code could of course be retired  
in favour of a use of that interface. Until then, the patch would  
serve its purpose.

>> Since the patch:
>> 1) solves an annoying problem,
>
> Which no one complained about.

Two translators have pointed out the defect. Either should be  
sufficient.

> Keep in mind that error message in the libraries are translated, and
> 'svn help' is translated.  Is it really that important to translate
> 'invalid option'?  If it is, wouldn't someone have complained about it
> on users@?

Surely the worthiness of bugs to be fixed isn't measured in the number  
of complaints to the users@ mailing list?
Of course it's important to translate that message -- if localisation  
is important at all, that is.

> Feel free to point me at another place in svn's code that ignores API
> non-promises or parses something that wasn't intended to be parsed.

The suggested patch does not misuse the API. Should APR change,  
crashes or corruption cannot occur; the API guarantees that.


Re: Make option parsing error messages translatable

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Sat, May 04, 2013 at 13:55:25 +0200:
> 3 maj 2013 kl. 16.10 skrev Daniel Shahaf:
> 
> >It's a hack.  It hardcodes the specific first and third arguments
> >to the
> >   (os->errfn)("%s: %s: %c\n", _, "invalid option", _);
> >call in APR's unix/getopt.c.  The only benefit we get in return is
> >translating six two-word error messages in option parsing.
> 
> This "only" benefit is precisely the one intended: to make a small
> set of important error messages translatable.
> Could you please enumerate the negative effects the patch would
> have, in your opinion?
> 

The patch achieves its end in an unmaintainable way.  It's liable to
break when APR's implementation details change, and since it ultimately
attempts to parse something which wasn't intended to be parsed, it might
do the _wrong_ thing (as opposed to just not doing anything) someday.

I can't see today what other circumstances APR would call errfn->("%s:
%s: %c\n", _, "invalid option", _) in, but they don't guarantee us that
there won't be other circumstances so I tend not to assume that they
won't.  That's how I treat Subversion APIs that I write and call, and
that's how I treat APR too: I rely on promises, not on implementation
details.

> Since the patch:
>  1) solves an annoying problem,

Which no one complained about.

Keep in mind that error message in the libraries are translated, and
'svn help' is translated.  Is it really that important to translate
'invalid option'?  If it is, wouldn't someone have complained about it
on users@?

>  2) does so by adding a small amount of code in a single place,
>  3) with robust failure modes (if anything breaks, we will just fall
> back to the old behaviour),
>  4) without changing the behaviour in any way whatsoever for those
> living in LANG=C,
>  5) a "proper" solution is likely to be a lot more invasive, involve
> cross-project work, require new library versions and certainly would
> not be done for 1.8.x for any x, and most likely will never happen,
> 
> it does not seem to be something that should be very controversial,
> should it?

Feel free to point me at another place in svn's code that ignores API
non-promises or parses something that wasn't intended to be parsed.

We don't do that, and we frown upon users who try to do that to us.
(For example, we advised users not to use rsync with FSFS repository for
years before we actually implemented something that caused that to be a
potentially-corruption-causing workflow.)

Re: Make option parsing error messages translatable

Posted by Mattias Engdegård <ma...@bredband.net>.
3 maj 2013 kl. 16.10 skrev Daniel Shahaf:

> It's a hack.  It hardcodes the specific first and third arguments to  
> the
>    (os->errfn)("%s: %s: %c\n", _, "invalid option", _);
> call in APR's unix/getopt.c.  The only benefit we get in return is
> translating six two-word error messages in option parsing.

This "only" benefit is precisely the one intended: to make a small set  
of important error messages translatable.
Could you please enumerate the negative effects the patch would have,  
in your opinion?

Since the patch:
  1) solves an annoying problem,
  2) does so by adding a small amount of code in a single place,
  3) with robust failure modes (if anything breaks, we will just fall  
back to the old behaviour),
  4) without changing the behaviour in any way whatsoever for those  
living in LANG=C,
  5) a "proper" solution is likely to be a lot more invasive, involve  
cross-project work, require new library versions and certainly would  
not be done for 1.8.x for any x, and most likely will never happen,

it does not seem to be something that should be very controversial,  
should it?
(That doesn't mean it cannot be improved technically, of course.)

> Anyway, how about this:
>
> static void
> emit_option_error(void *baton, const char *fmt, ...)
> {
>  struct { FILE *stream; const char *argv0; apr_pool_t *pool; } *b =  
> baton;
>  va_list va;
>  va_start(va, fmt);
>  vfprintf(b->stream, fmt, va);
>  va_end(va);
>  svn_error_clear(svn_cmdline_fprintf(b->pool,
>                                      _("%s: See '%s --help' or '%s -- 
> help <subcommand>'"),
>                                      b->argv0, b->argv0, b->argv0));
>  return;
> }

That does not address the problem at all; the apr_getopt messages are  
still shown untranslated.

Besides, there already is a translated message after a failed option  
parse ("Type '%s help' for usage.\n") and it is emitted after the  
getopt has returned with an error, where it should be.


Re: Make option parsing error messages translatable

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
C. Michael Pilato wrote on Fri, May 03, 2013 at 08:56:25 -0400:
> On 05/02/2013 04:45 PM, Mattias Engdegård wrote:
> > It is quite annoying that some of the more common error messages, the
> > 
> > svn: invalid option: --gurgle
> > 
> > kind, are entirely untranslatable because they are printed directly by 
> > APR which does not have any concept of translation at all. (This was 
> > reported before but met with silence:
> > http://svn.haxx.se/dev/archive-2008-05/1445.shtml)
> > 
> > It is not easy to fix it in a clean way; ideally, APR would return a 
> > detailed error code together with the required string parameters, but 
> > it doesn't. Even if we did change APR to that effect, it would be an 
> > incompatibility requiring a new APR version for use with Subversion.
> > 
> > This patch, instead, installs an apr_getopt_err_fn_t, acting as a 
> > replacement for fprintf. It is admittedly a hack, but quite a 
> > localised one -- it doesn't foul up any other code -- and does not 
> > have any negative effects that I can think of. In short, it should be 
> > a strict improvement.
> > 
> > Should APR's getopt error strings change, all that will happen is that 
> > they are shown untranslated, just like any other translated strings.
> 
> It's a hack, to be sure, but it doesn't strike me as one that carries a
> particularly high penalty.  Plus, it's ultimately not much more of a hack
> than we already employ in *much* more commonly executed code -- the code
> which determines which of the svn_error_t's in the chain are "tracing
> errors".  (See svn_error__is_tracing_link().)
> 
> +1 (from me, at least) to commit.  That said, I know danielsh voiced some
> pretty strong opinions the opposite direction about this on IRC, so please
> respect those and see if he's willing to budge before moving forward.

It's a hack.  It hardcodes the specific first and third arguments to the
    (os->errfn)("%s: %s: %c\n", _, "invalid option", _);
call in APR's unix/getopt.c.  The only benefit we get in return is
translating six two-word error messages in option parsing.

Anyway, how about this:

static void
emit_option_error(void *baton, const char *fmt, ...)
{
  struct { FILE *stream; const char *argv0; apr_pool_t *pool; } *b = baton;
  va_list va;
  va_start(va, fmt);
  vfprintf(b->stream, fmt, va);
  va_end(va);
  svn_error_clear(svn_cmdline_fprintf(b->pool, 
                                      _("%s: See '%s --help' or '%s --help <subcommand>'"),
                                      b->argv0, b->argv0, b->argv0));
  return;
}

Re: Make option parsing error messages translatable

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 05/02/2013 04:45 PM, Mattias Engdegård wrote:
> It is quite annoying that some of the more common error messages, the
> 
> svn: invalid option: --gurgle
> 
> kind, are entirely untranslatable because they are printed directly by 
> APR which does not have any concept of translation at all. (This was 
> reported before but met with silence:
> http://svn.haxx.se/dev/archive-2008-05/1445.shtml)
> 
> It is not easy to fix it in a clean way; ideally, APR would return a 
> detailed error code together with the required string parameters, but 
> it doesn't. Even if we did change APR to that effect, it would be an 
> incompatibility requiring a new APR version for use with Subversion.
> 
> This patch, instead, installs an apr_getopt_err_fn_t, acting as a 
> replacement for fprintf. It is admittedly a hack, but quite a 
> localised one -- it doesn't foul up any other code -- and does not 
> have any negative effects that I can think of. In short, it should be 
> a strict improvement.
> 
> Should APR's getopt error strings change, all that will happen is that 
> they are shown untranslated, just like any other translated strings.

It's a hack, to be sure, but it doesn't strike me as one that carries a
particularly high penalty.  Plus, it's ultimately not much more of a hack
than we already employ in *much* more commonly executed code -- the code
which determines which of the svn_error_t's in the chain are "tracing
errors".  (See svn_error__is_tracing_link().)

+1 (from me, at least) to commit.  That said, I know danielsh voiced some
pretty strong opinions the opposite direction about this on IRC, so please
respect those and see if he's willing to budge before moving forward.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development