You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Mattias Engdegård <ma...@bredband.net> on 2013/05/02 22:45:00 UTC
Make option parsing error messages translatable
It is quite annoying that some of the more common error messages, the
svn: invalid option: --gurgle
kind, are entirely untranslatable because they are printed directly by
APR which does not have any concept of translation at all. (This was
reported before but met with silence: http://svn.haxx.se/dev/archive-2008-05/1445.shtml)
It is not easy to fix it in a clean way; ideally, APR would return a
detailed error code together with the required string parameters, but
it doesn't. Even if we did change APR to that effect, it would be an
incompatibility requiring a new APR version for use with Subversion.
This patch, instead, installs an apr_getopt_err_fn_t, acting as a
replacement for fprintf. It is admittedly a hack, but quite a
localised one -- it doesn't foul up any other code -- and does not
have any negative effects that I can think of. In short, it should be
a strict improvement.
Should APR's getopt error strings change, all that will happen is that
they are shown untranslated, just like any other translated strings.
Re: Make option parsing error messages translatable
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Wed, May 08, 2013 at 15:34:58 +0200:
> On Wed, May 08, 2013 at 03:57:22PM +0300, Daniel Shahaf wrote:
> > I don't really have much to say to the rest of your mail that I didn't
> > already say. It boils down to a disagreement over whether
> > lambda x: !strcmp(x, "invalid option")
> > is a stable API.
>
> I think this problem should be fixed in APR. APR should provide some
> official way of translating error messages.
>
> APR is a dependency we are using and this dependency does not support
> localization. Let's recognize that as the real problem and lobby to
> get it fixed, rather than working around it.
While discussing Mattias' recent emails with him on IRC, I came up with
the following observations: tools/dev/aprerr.txt contains over 90 error
codes, for all of which svn_strerror() will call apr_strerror() and
return an English error message.
Re: Make option parsing error messages translatable
Posted by Mattias Engdegård <ma...@bredband.net>.
8 maj 2013 kl. 15.34 skrev Stefan Sperling:
> I think this problem should be fixed in APR. APR should provide some
> official way of translating error messages.
I agree. While it would be easy to do technically, getting such a
change accepted into APR, a new APR release to be made, and a new
Subversion release that requires that new APR version would likely
take quite some time. Hence my stopgap solution, which works now, is
robust, and can be done in 1.8.x.
Re: Make option parsing error messages translatable
Posted by Stefan Sperling <st...@elego.de>.
On Wed, May 08, 2013 at 03:57:22PM +0300, Daniel Shahaf wrote:
> I don't really have much to say to the rest of your mail that I didn't
> already say. It boils down to a disagreement over whether
> lambda x: !strcmp(x, "invalid option")
> is a stable API.
I think this problem should be fixed in APR. APR should provide some
official way of translating error messages.
APR is a dependency we are using and this dependency does not support
localization. Let's recognize that as the real problem and lobby to
get it fixed, rather than working around it.
Re: Make option parsing error messages translatable
Posted by Mattias Engdegård <ma...@bredband.net>.
8 maj 2013 kl. 14.57 skrev Daniel Shahaf:
> Apples and oranges. The issue you're talking about is that _("hello
> world") would have to be translated differently in two places within
> our
> code. Your patch potentially uses _() either on string literals from
> APR's future code (which, unlike string literals in our code, we can't
> review pre-release for appropriateness of the translation to both of
> them) or even on user input.
When was the last time any such review was done? I doubt any has ever
been performed.
But that doesn't actually matter. When a translator sees
#: somefile.c:123
msgid "some string"
in the .po file, his responsibility is to translate that string in
general and as used in that particular spot in the source in
particular. (Most of the time he will just write the obvious
translation right away without so much as looking at the source. Only
if the message is difficult to understand in isolation will the source
be consulted at all.)
In this case, what he will find in the Subversion source are the five
strings
"%s: invalid option: %s\n"
"%s: missing argument: %s\n"
"%s: erroneous argument: %s\n"
"%s: invalid option character: %c\n"
"%s: missing argument: %c\n"
in the command-line option parsing error handler. Those strings are
quite trivial to translate as they are, but a translator bothering to
look at the source would find that it confirms his intuition: these
are option parsing error messages, and can have only one meaning.
In other words, it does not matter how or why APR generates them, or
if this would change at any point in the future. The translation has
been done based on the strings' occurrence in the Subversion source,
not in APR.
I will gladly add a comment to translators looking that that part of
the source that they should not try to take the exact origin of those
strings in APR into account when translating, in the unlikely event
that anyone would ever contemplate such an endeavour.
Re: Make option parsing error messages translatable
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Tue, May 07, 2013 at 21:44:00 +0200:
> 7 maj 2013 kl. 01.00 skrev Daniel Shahaf:
>
>> Of course there is, if someday APR uses "invalid option" in another
>> meaning than the one this patch assumes it does.
>
> Believe it or not, that is already how gettext translation works. If you
> think that really is a problem, then Subversion has got it in spades.
> Every time we mark up a string for translation with _(), there is the
> danger that someone will add an identical string in another meaning
> somewhere else, and nothing will alert us to the fact.
>
Apples and oranges. The issue you're talking about is that _("hello
world") would have to be translated differently in two places within our
code. Your patch potentially uses _() either on string literals from
APR's future code (which, unlike string literals in our code, we can't
review pre-release for appropriateness of the translation to both of
them) or even on user input.
I don't really have much to say to the rest of your mail that I didn't
already say. It boils down to a disagreement over whether
lambda x: !strcmp(x, "invalid option")
is a stable API.
Re: Make option parsing error messages translatable
Posted by Mattias Engdegård <ma...@bredband.net>.
7 maj 2013 kl. 01.00 skrev Daniel Shahaf:
> Of course there is, if someday APR uses "invalid option" in another
> meaning than the one this patch assumes it does.
Believe it or not, that is already how gettext translation works. If
you think that really is a problem, then Subversion has got it in
spades. Every time we mark up a string for translation with _(), there
is the danger that someone will add an identical string in another
meaning somewhere else, and nothing will alert us to the fact.
I admit feeling uneasy about this myself the first time I saw gettext,
but have come to understand that it is quite robust in practice --
witness the thousands of software packages successfully doing
translation that way, Subversion included.
The main "danger", which I have mentioned before, is marking up
strings that are too short or ambiguous in nature, in particular
single words like "merge" or "yes". Such strings should generally be
avoided or, if they cannot be, use gettext contexts (which are not
used anywhere in Subversion yet). However, something as verbose as
"%s: invalid option: %s\n" should be quite safe, and is not atypical
of strings in our message catalogue.
(In addition, the very fact that these strings are sent to the
function that should print out error messages during option parsing
serves as a tremendously strong disambiguating force.)
> That's why API promises exist: so anticipating won't be needed. The
> API
> promises something and you work to that. The API doesn't promise you
> something and you don't presume it'll happen.
We use the API to good effect here, as well as common sense. Regarding
the semantics, I think everyone will agree that if APR tells the error
printer "invalid option" during option parsing, it won't diverge much
if at all in meaning for the next couple of centuries, no matter how
wildly APR changes.
Note that the strings passed to the errfn never affect the logic of
the program in any way, only their translations. This is also an
interface of sorts, but a softer, human interface, the interface of
translations, which carries its own rules and logic.
> Well yeah, it could grow an enum.
>
> Or it could just promise that "invalid option" in the 2nd replaceable
> argument will always be okay to translate. That doesn't require any
> code changes, just a decision by the APR devs not to introduce other
> meanings to that particular string in the future.
Either would do, but neither is really necessary. In practice, the APR
getopt code is as inert as they come - it basically hasn't been
touched in over a decade. But sure, we could work on APR, and this
patch would work nicely in the meantime.
> I've missed the previous report of that problem. (Is that why Jens is
> on the CC? I haven't seen him in a while.)
It was mentioned in the first message of this thread:
http://svn.haxx.se/dev/archive-2008-05/1445.shtml
(Jens, sorry to bother you, but I didn't anticipate such a lengthy
discussion, and included you since you showed an interest in the past.
If you want some peace and quiet, please don't hesitate to tell us off.)
Re: Make option parsing error messages translatable
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Tue, May 07, 2013 at 00:45:44 +0200:
> 6 maj 2013 kl. 23.39 skrev Daniel Shahaf:
>
> >The patch achieves its end in an unmaintainable way. It's liable to
> >break when APR's implementation details change, and since it
> >ultimately
> >attempts to parse something which wasn't intended to be parsed, it
> >might
> >do the _wrong_ thing (as opposed to just not doing anything) someday.
>
> No, APR is constrained by its API; it cannot change arbitrarily.
>
Aware of that, but "%s: %s: %c\n" is not an API promise, it's an
implementation detail.
> >I can't see today what other circumstances APR would call errfn->("%s:
> >%s: %c\n", _, "invalid option", _) in, but they don't guarantee us
> >that
> >there won't be other circumstances so I tend not to assume that they
> >won't. That's how I treat Subversion APIs that I write and call, and
> >that's how I treat APR too: I rely on promises, not on implementation
> >details.
>
> No, the patch will translate "%s: invalid option: %s\n" and a few
> other messages as specified by the .po files. There is no way it can
> "do the wrong thing".
>
Of course there is, if someday APR uses "invalid option" in another
meaning than the one this patch assumes it does.
I already said, I don't see exactly how that can happen since errfn has
very few and specific usages --- but I don't want to assume no one will
ever find another creative use for errfn that your patch does not
anticipate.
That's why API promises exist: so anticipating won't be needed. The API
promises something and you work to that. The API doesn't promise you
something and you don't presume it'll happen.
> The worst that can happen is that someone changes the spelling or
> formatting of one of the messages in APR. The consequence is then
> that this message will be displayed untranslated, and that the
> translators may not immediately get to know about it since the
> changed strings are not seen by the gettext string extractor.
> Clearly, this is no worse than the present situation.
>
> Yes, this patch relies on promises made by the APR - the errfn
> signature and semantics.
>
No, not the signature, but the specific arguments and their meaning.
> Should APR eventually contain a more sophisticated error-reporting
> interface in the future, the proposed code could of course be
> retired in favour of a use of that interface. Until then, the patch
> would serve its purpose.
>
Well yeah, it could grow an enum.
Or it could just promise that "invalid option" in the 2nd replaceable
argument will always be okay to translate. That doesn't require any
code changes, just a decision by the APR devs not to introduce other
meanings to that particular string in the future.
> >>Since the patch:
> >>1) solves an annoying problem,
> >
> >Which no one complained about.
>
> Two translators have pointed out the defect. Either should be
> sufficient.
>
I've missed the previous report of that problem. (Is that why Jens is
on the CC? I haven't seen him in a while.)
> >Keep in mind that error message in the libraries are translated, and
> >'svn help' is translated. Is it really that important to translate
> >'invalid option'? If it is, wouldn't someone have complained about it
> >on users@?
>
> Surely the worthiness of bugs to be fixed isn't measured in the
> number of complaints to the users@ mailing list?
The facts are that two translators complained and no user complained.
It's not a question of "5 users v. 5000 users".
> Of course it's important to translate that message -- if
> localisation is important at all, that is.
>
> >Feel free to point me at another place in svn's code that ignores API
> >non-promises or parses something that wasn't intended to be parsed.
>
> The suggested patch does not misuse the API. Should APR change,
> crashes or corruption cannot occur; the API guarantees that.
That's correct, it won't crash. It might print a different message than
APR intended.
Re: Make option parsing error messages translatable
Posted by Mattias Engdegård <ma...@bredband.net>.
6 maj 2013 kl. 23.39 skrev Daniel Shahaf:
> The patch achieves its end in an unmaintainable way. It's liable to
> break when APR's implementation details change, and since it
> ultimately
> attempts to parse something which wasn't intended to be parsed, it
> might
> do the _wrong_ thing (as opposed to just not doing anything) someday.
No, APR is constrained by its API; it cannot change arbitrarily.
> I can't see today what other circumstances APR would call errfn->("%s:
> %s: %c\n", _, "invalid option", _) in, but they don't guarantee us
> that
> there won't be other circumstances so I tend not to assume that they
> won't. That's how I treat Subversion APIs that I write and call, and
> that's how I treat APR too: I rely on promises, not on implementation
> details.
No, the patch will translate "%s: invalid option: %s\n" and a few
other messages as specified by the .po files. There is no way it can
"do the wrong thing".
The worst that can happen is that someone changes the spelling or
formatting of one of the messages in APR. The consequence is then that
this message will be displayed untranslated, and that the translators
may not immediately get to know about it since the changed strings are
not seen by the gettext string extractor. Clearly, this is no worse
than the present situation.
Yes, this patch relies on promises made by the APR - the errfn
signature and semantics.
Should APR eventually contain a more sophisticated error-reporting
interface in the future, the proposed code could of course be retired
in favour of a use of that interface. Until then, the patch would
serve its purpose.
>> Since the patch:
>> 1) solves an annoying problem,
>
> Which no one complained about.
Two translators have pointed out the defect. Either should be
sufficient.
> Keep in mind that error message in the libraries are translated, and
> 'svn help' is translated. Is it really that important to translate
> 'invalid option'? If it is, wouldn't someone have complained about it
> on users@?
Surely the worthiness of bugs to be fixed isn't measured in the number
of complaints to the users@ mailing list?
Of course it's important to translate that message -- if localisation
is important at all, that is.
> Feel free to point me at another place in svn's code that ignores API
> non-promises or parses something that wasn't intended to be parsed.
The suggested patch does not misuse the API. Should APR change,
crashes or corruption cannot occur; the API guarantees that.
Re: Make option parsing error messages translatable
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mattias Engdegård wrote on Sat, May 04, 2013 at 13:55:25 +0200:
> 3 maj 2013 kl. 16.10 skrev Daniel Shahaf:
>
> >It's a hack. It hardcodes the specific first and third arguments
> >to the
> > (os->errfn)("%s: %s: %c\n", _, "invalid option", _);
> >call in APR's unix/getopt.c. The only benefit we get in return is
> >translating six two-word error messages in option parsing.
>
> This "only" benefit is precisely the one intended: to make a small
> set of important error messages translatable.
> Could you please enumerate the negative effects the patch would
> have, in your opinion?
>
The patch achieves its end in an unmaintainable way. It's liable to
break when APR's implementation details change, and since it ultimately
attempts to parse something which wasn't intended to be parsed, it might
do the _wrong_ thing (as opposed to just not doing anything) someday.
I can't see today what other circumstances APR would call errfn->("%s:
%s: %c\n", _, "invalid option", _) in, but they don't guarantee us that
there won't be other circumstances so I tend not to assume that they
won't. That's how I treat Subversion APIs that I write and call, and
that's how I treat APR too: I rely on promises, not on implementation
details.
> Since the patch:
> 1) solves an annoying problem,
Which no one complained about.
Keep in mind that error message in the libraries are translated, and
'svn help' is translated. Is it really that important to translate
'invalid option'? If it is, wouldn't someone have complained about it
on users@?
> 2) does so by adding a small amount of code in a single place,
> 3) with robust failure modes (if anything breaks, we will just fall
> back to the old behaviour),
> 4) without changing the behaviour in any way whatsoever for those
> living in LANG=C,
> 5) a "proper" solution is likely to be a lot more invasive, involve
> cross-project work, require new library versions and certainly would
> not be done for 1.8.x for any x, and most likely will never happen,
>
> it does not seem to be something that should be very controversial,
> should it?
Feel free to point me at another place in svn's code that ignores API
non-promises or parses something that wasn't intended to be parsed.
We don't do that, and we frown upon users who try to do that to us.
(For example, we advised users not to use rsync with FSFS repository for
years before we actually implemented something that caused that to be a
potentially-corruption-causing workflow.)
Re: Make option parsing error messages translatable
Posted by Mattias Engdegård <ma...@bredband.net>.
3 maj 2013 kl. 16.10 skrev Daniel Shahaf:
> It's a hack. It hardcodes the specific first and third arguments to
> the
> (os->errfn)("%s: %s: %c\n", _, "invalid option", _);
> call in APR's unix/getopt.c. The only benefit we get in return is
> translating six two-word error messages in option parsing.
This "only" benefit is precisely the one intended: to make a small set
of important error messages translatable.
Could you please enumerate the negative effects the patch would have,
in your opinion?
Since the patch:
1) solves an annoying problem,
2) does so by adding a small amount of code in a single place,
3) with robust failure modes (if anything breaks, we will just fall
back to the old behaviour),
4) without changing the behaviour in any way whatsoever for those
living in LANG=C,
5) a "proper" solution is likely to be a lot more invasive, involve
cross-project work, require new library versions and certainly would
not be done for 1.8.x for any x, and most likely will never happen,
it does not seem to be something that should be very controversial,
should it?
(That doesn't mean it cannot be improved technically, of course.)
> Anyway, how about this:
>
> static void
> emit_option_error(void *baton, const char *fmt, ...)
> {
> struct { FILE *stream; const char *argv0; apr_pool_t *pool; } *b =
> baton;
> va_list va;
> va_start(va, fmt);
> vfprintf(b->stream, fmt, va);
> va_end(va);
> svn_error_clear(svn_cmdline_fprintf(b->pool,
> _("%s: See '%s --help' or '%s --
> help <subcommand>'"),
> b->argv0, b->argv0, b->argv0));
> return;
> }
That does not address the problem at all; the apr_getopt messages are
still shown untranslated.
Besides, there already is a translated message after a failed option
parse ("Type '%s help' for usage.\n") and it is emitted after the
getopt has returned with an error, where it should be.
Re: Make option parsing error messages translatable
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
C. Michael Pilato wrote on Fri, May 03, 2013 at 08:56:25 -0400:
> On 05/02/2013 04:45 PM, Mattias Engdegård wrote:
> > It is quite annoying that some of the more common error messages, the
> >
> > svn: invalid option: --gurgle
> >
> > kind, are entirely untranslatable because they are printed directly by
> > APR which does not have any concept of translation at all. (This was
> > reported before but met with silence:
> > http://svn.haxx.se/dev/archive-2008-05/1445.shtml)
> >
> > It is not easy to fix it in a clean way; ideally, APR would return a
> > detailed error code together with the required string parameters, but
> > it doesn't. Even if we did change APR to that effect, it would be an
> > incompatibility requiring a new APR version for use with Subversion.
> >
> > This patch, instead, installs an apr_getopt_err_fn_t, acting as a
> > replacement for fprintf. It is admittedly a hack, but quite a
> > localised one -- it doesn't foul up any other code -- and does not
> > have any negative effects that I can think of. In short, it should be
> > a strict improvement.
> >
> > Should APR's getopt error strings change, all that will happen is that
> > they are shown untranslated, just like any other translated strings.
>
> It's a hack, to be sure, but it doesn't strike me as one that carries a
> particularly high penalty. Plus, it's ultimately not much more of a hack
> than we already employ in *much* more commonly executed code -- the code
> which determines which of the svn_error_t's in the chain are "tracing
> errors". (See svn_error__is_tracing_link().)
>
> +1 (from me, at least) to commit. That said, I know danielsh voiced some
> pretty strong opinions the opposite direction about this on IRC, so please
> respect those and see if he's willing to budge before moving forward.
It's a hack. It hardcodes the specific first and third arguments to the
(os->errfn)("%s: %s: %c\n", _, "invalid option", _);
call in APR's unix/getopt.c. The only benefit we get in return is
translating six two-word error messages in option parsing.
Anyway, how about this:
static void
emit_option_error(void *baton, const char *fmt, ...)
{
struct { FILE *stream; const char *argv0; apr_pool_t *pool; } *b = baton;
va_list va;
va_start(va, fmt);
vfprintf(b->stream, fmt, va);
va_end(va);
svn_error_clear(svn_cmdline_fprintf(b->pool,
_("%s: See '%s --help' or '%s --help <subcommand>'"),
b->argv0, b->argv0, b->argv0));
return;
}
Re: Make option parsing error messages translatable
Posted by "C. Michael Pilato" <cm...@collab.net>.
On 05/02/2013 04:45 PM, Mattias Engdegård wrote:
> It is quite annoying that some of the more common error messages, the
>
> svn: invalid option: --gurgle
>
> kind, are entirely untranslatable because they are printed directly by
> APR which does not have any concept of translation at all. (This was
> reported before but met with silence:
> http://svn.haxx.se/dev/archive-2008-05/1445.shtml)
>
> It is not easy to fix it in a clean way; ideally, APR would return a
> detailed error code together with the required string parameters, but
> it doesn't. Even if we did change APR to that effect, it would be an
> incompatibility requiring a new APR version for use with Subversion.
>
> This patch, instead, installs an apr_getopt_err_fn_t, acting as a
> replacement for fprintf. It is admittedly a hack, but quite a
> localised one -- it doesn't foul up any other code -- and does not
> have any negative effects that I can think of. In short, it should be
> a strict improvement.
>
> Should APR's getopt error strings change, all that will happen is that
> they are shown untranslated, just like any other translated strings.
It's a hack, to be sure, but it doesn't strike me as one that carries a
particularly high penalty. Plus, it's ultimately not much more of a hack
than we already employ in *much* more commonly executed code -- the code
which determines which of the svn_error_t's in the chain are "tracing
errors". (See svn_error__is_tracing_link().)
+1 (from me, at least) to commit. That said, I know danielsh voiced some
pretty strong opinions the opposite direction about this on IRC, so please
respect those and see if he's willing to budge before moving forward.
--
C. Michael Pilato <cm...@collab.net>
CollabNet <> www.collab.net <> Enterprise Cloud Development