Posted to dev@subversion.apache.org by Nicolás Lichtmaier <ni...@reloco.com.ar> on 2004/03/30 03:55:34 UTC

I18n: The gettext proposal

Hi, I propose using gettext for i18n.

First, some info about how gettext works.

Gettext translates user messages at runtime. This is done by the 
gettext() function, which takes a string and provides its localized 
counterpart. The "weird" thing about gettext is that it uses the English 
message as the look-up key, not a number, not an ID, but the real and 
usable English message. Normally you won't see "gettext()" in the code,
because this define is used instead:

#define _(x) gettext(x)

...so what you would see is:

printf(_("I'm a localized message and I'm proud of it"));

Once all the localizable strings are marked with _(), it's time to
run xgettext. This tool scans a list of source files and produces a
single file: subversion.pot. This file contains entries, and each entry
looks like this one:

#: subversion/clients/cmdline/blame-cmd.c:145
#, c-format
msgid "Skipping binary file: '%s'\n"
msgstr ""


The subversion.pot file contains every localizable message, and serves 
as a starting point for new localizations. A new localization starts by 
copying this file to (e.g.) es.po. There, the translator fills the empty 
string with the proper translation:

#: subversion/clients/cmdline/blame-cmd.c:145
#, c-format
msgid "Skipping binary file: '%s'\n"
msgstr "Omitiendo el archivo binario: '%s'\n"

The .pot file is never edited by hand; it's always regenerated by
xgettext. There's a tool called msgmerge which merges a new
subversion.pot into an old XX.po. This tool is quite smart: it even
suggests translations (marked as fuzzy) for new messages that look
similar to old ones.

Each of these XX.po (where XX is an ISO language code) is compiled by 
msgfmt to produce a .mo file, which is installed to 
/usr/share/locale/XX/LC_MESSAGES/subversion.mo . So if Subversion 
supported 3 additional languages, it would ship 3 .mo files, which would 
be used at runtime according to the user's locale settings.

A system may have several programs using gettext. They don't collide
with each other because each one names its .mo files differently. This
name is called the domain. A domain is a set of messages that need
translation. We would create a "subversion" domain.

I told you about gettext() before. In fact, this function is not the one
used in the patch I've sent. Why? Because svn is also a library. When
it runs as a library, the process is already configured to use some
other domain. Suppose that gnumeric added subversion support. Gnumeric
already uses gettext, with a "gnumeric" domain (so its files are in
/usr/share/locale/XX/LC_MESSAGES/gnumeric.mo). We don't want to use this
domain, but we don't want to disturb the main application either. So we
use dgettext, a version of the function which takes the domain as an
argument on each invocation. That way the different svn libraries will
provide properly localized messages, even when run inside some other
application.

To properly implement i18n, some things will need to change:
It's impossible to handle plurals like this, because other languages
don't form plurals by just appending a suffix:
printf("%d file%s", n, n>1 ? "s": "");
Gettext provides its own way of handling this, which requires keeping
the two versions of the message as separate, complete strings: "%d file"
and "%d files" (look for ngettext in the manual).

Another problem is this:
printf("We are at revision " I_M_THE_REV_FMT "\n", rev);
This clashes with the gettext scheme because it breaks the xgettext
scanning. xgettext doesn't process includes or expand macros, so it
can't figure out what the message is supposed to be. One way to fix
this, if we can't get rid of the define, is to format the number in a
separate step, probably in an auxiliary function:

printf("We are at revision %s\n", fmt_rev(buf, rev));

One thing which is still unresolved is how to translate server messages.
I think these should be handled (in DAV) with the Accept-Language
header. This header tells the server which languages the client is
willing to accept, so that the server can choose the proper .mo and
serve the right messages. This would be implemented in a second stage,
since there are some issues that still need to be resolved.

Well, I'll end this for now. This mail has got too long.

Bye! =)

