Posted to embperl@perl.apache.org by Robert <ro...@robert.cz> on 2001/11/26 12:54:49 UTC
Multilingual pages in Embperl

Hi,

  I guess lots of people need to create multilingual web pages, but I
don't remember any discussion about how best to do it in Embperl. Below I
describe my solution along with a couple of comments. Please feel free to
jump in and add your experiences; I for one am very interested.

  There's the more or less standard GNU gettext package, and there's the
experimental Locale::PGettext.pm, a pure-Perl version of it. Both
start with marking the terms to be translated, typically as gettext('Hello!')
or _('Hello!'), and provide a simple utility to extract them from the
sources. I write either [+ _ 'Hello!' +] or [+ gettext 'Hello!' +] in
Embperl and have my own utility to extract them. Then both the C and the
Perl gettext build a DBM-based dictionary and use it to look up
translations at runtime. Here I think Embperl can do better: Perl is said
to have a quite good hash implementation and mod_perl lets us easily keep
stuff in memory, so why don't we just preload MyApp::Gettext.pm with
something like this (simplified):

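	# %dict maps term => { language code => translation }
	unless (%dict) {
		%dict = ... read dictionary from disk
	}

	# look up the term in the request's language, English by default
	sub gettext {
		$dict{$_[0]}{$fdat{lang} || 'en'}
	}

	# short alias for gettext
	sub _ {
		gettext $_[0]
	}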

  I haven't tested this preloading much yet, but preloading isn't really
necessary (I don't care about performance just yet); the code works fine
even without it.
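
  A minimal sketch of the "read dictionary from disk" step, under
assumptions that are not in the mail: the dictionary is a tab-separated
text file with one "term, language, translation" triple per line, and the
names load_dict and translate are hypothetical. The language is passed in
explicitly here; inside Embperl it would come from $fdat{lang} as above.

```perl
use strict;
use warnings;

# term => { language code => translation }, filled once and then reused
my %dict;

# Parse "term<TAB>lang<TAB>translation" lines (an assumed format) from a filehandle.
sub load_dict {
    my ($fh) = @_;
    my %loaded;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#.*)?$/;    # skip blank lines and comments
        my ($term, $lang, $text) = split /\t/, $line, 3;
        next unless defined $text;            # ignore malformed lines
        $loaded{$term}{$lang} = $text;
    }
    return %loaded;
}

# Return the translation, or the term itself when none is known.
sub translate {
    my ($term, $lang) = @_;
    $lang ||= 'en';
    my $entry = $dict{$term} or return $term;
    return defined $entry->{$lang} ? $entry->{$lang} : $term;
}

# Demo data in an in-memory filehandle; a real module would open a file here.
my $sample = "Hello!\ten\tHello!\nHello!\tcs\tAhoj!\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
%dict = load_dict($fh) unless %dict;         # load only once per process
close $fh;

print translate('Hello!', 'cs'), "\n";       # known term
print translate('Goodbye!', 'cs'), "\n";     # unknown term falls back to itself
```

  Falling back to the untranslated term is a design choice of this sketch;
the simplified code above would return undef instead.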

  Now, given that I can keep the dictionaries in whatever format I want,
there's no need to fool around with utilities for managing DBM files.
After a source code change I just rerun the text extraction, which adds
the new terms to the dictionary with the default translation '???', and
send the dictionary to the translators. They can search for the '???'
string and fill in the new translations, and we're done.
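
  The extract-and-merge cycle could look roughly like the sketch below.
It is not my actual utility: extract_terms and merge_terms are
hypothetical names, the regular expression only handles simple
single-quoted strings without escaped quotes, and the dictionary is the
same term => { lang => translation } hash as before.

```perl
use strict;
use warnings;

# Collect the distinct single-quoted terms passed to _ or gettext,
# in order of first appearance.
sub extract_terms {
    my ($source) = @_;
    my (%seen, @terms);
    while ($source =~ /\b(?:gettext|_)\s*\(?\s*'([^']*)'/g) {
        push @terms, $1 unless $seen{$1}++;
    }
    return @terms;
}

# Add every missing term/language pair with the placeholder '???'.
# Existing translations are left untouched. Returns the number added.
sub merge_terms {
    my ($dict, $langs, @terms) = @_;
    my $added = 0;
    for my $term (@terms) {
        for my $lang (@$langs) {
            next if exists $dict->{$term}{$lang};
            $dict->{$term}{$lang} = '???';
            $added++;
        }
    }
    return $added;
}

# Demo: a small Embperl page and a dictionary that already knows one term.
my $page = <<'EMBPERL';
<h1>[+ _ 'Hello!' +]</h1>
<p>[+ sprintf gettext('%d records found'), $n +]</p>
<p>[+ gettext 'Hello!' +]</p>
EMBPERL

my %demo_dict   = ('Hello!' => { cs => 'Ahoj!' });
my @found_terms = extract_terms($page);
my $added_count = merge_terms(\%demo_dict, ['cs'], @found_terms);

# List what the translator still has to fill in.
for my $term (sort keys %demo_dict) {
    for my $lang (sort keys %{ $demo_dict{$term} }) {
        print "$lang: $term\n" if $demo_dict{$term}{$lang} eq '???';
    }
}
```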

  There are some problems, of course:

1) Word order:
	[+ $n +] [+ _ 'records found' +]
  must become
	[+ sprintf gettext('%d records found'), $n +]
  so that it works in languages with a different word order, like Czech
('Nalezeno %d zaznamu').
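
  With more than one value in the sentence, a translation may also need
to swap the arguments. Perl's sprintf accepts explicit argument indexes
(%1$s, %2$d) for that. A small sketch; the helper names are hypothetical
and the 'xx' template is made up to show the reordering, it is not a real
translation.

```perl
use strict;
use warnings;

# One placeholder: the translation simply puts it somewhere else.
my %found_template = (
    en => '%d records found',
    cs => 'Nalezeno %d zaznamu',
);

# Two placeholders: explicit indexes let a template reorder the arguments.
# The 'xx' entry is an invented template, used only for illustration.
my %upload_template = (
    en => '%1$s uploaded %2$d files',
    xx => '%2$d files <- %1$s',
);

sub found_message {
    my ($lang, $n) = @_;
    return sprintf $found_template{$lang}, $n;
}

sub upload_message {
    my ($lang, $user, $count) = @_;
    return sprintf $upload_template{$lang}, $user, $count;
}

print found_message('cs', 5), "\n";               # Nalezeno 5 zaznamu
print upload_message('en', 'Robert', 3), "\n";    # Robert uploaded 3 files
print upload_message('xx', 'Robert', 3), "\n";    # 3 files <- Robert
```

  The calling code always passes the arguments in the same order; only
the template decides where each one ends up.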

2) Grammar:
	[$ if ($n > 1) $]
		[+ sprintf gettext('%d records found'), $n +]
	[$ elsif ($n == 1) $]
		[+ sprintf gettext('%d record found'), $n +]
	[$ else $]
		[+ gettext('No record found') +]
	[$ endif $]
  In Czech we have one more variant for $n in 2..4, and other languages can
be even worse... I have an idea how to partially fix this, but I never
really got around to testing it.
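
  One possible approach (not necessarily the idea mentioned above) is to
keep a small rule per language that maps the count to the index of a
grammatical form, and to store one template per form. The sketch below
assumes this layout; ngettext_sketch, %plural_rule and %plural_dict are
hypothetical names, and the Czech strings other than 'Nalezeno %d
zaznamu' are illustrative and should be checked by a translator.

```perl
use strict;
use warnings;

# Per-language rule: maps a count to the index of the form to use.
my %plural_rule = (
    en => sub { $_[0] == 1 ? 0 : 1 },
    cs => sub { $_[0] == 1 ? 0 : ($_[0] >= 2 && $_[0] <= 4) ? 1 : 2 },
);

# Each term carries one template per grammatical form of the language.
my %plural_dict = (
    '%d record found' => {
        en => [ '%d record found', '%d records found' ],
        cs => [ 'Nalezen %d zaznam', 'Nalezeny %d zaznamy', 'Nalezeno %d zaznamu' ],
    },
);

sub ngettext_sketch {
    my ($term, $n, $lang) = @_;
    $lang ||= 'en';
    # Unknown term: format the term itself.
    my $entry = $plural_dict{$term} or return sprintf($term, $n);
    # Fall back to English when the language lacks forms or a rule.
    $lang = 'en' unless $entry->{$lang} && $plural_rule{$lang};
    my $forms = $entry->{$lang} or return sprintf($term, $n);
    my $index = $plural_rule{$lang}->($n);
    $index = $#$forms if $index > $#$forms;   # never run past the last form
    return sprintf $forms->[$index], $n;
}

for my $n (1, 3, 5) {
    print ngettext_sketch('%d record found', $n, 'cs'), "\n";
}
```

  The page then needs a single call instead of an if/elsif chain per
language; the "No record found" case for zero can stay a separate term.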

3) Context:
  Sometimes the same word appears in two different contexts and should be
translated differently. I don't know yet what to do about that.
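
  One option would be to make an explicit context label part of the
dictionary key, so the same word can have several entries. This is only a
sketch of that idea: pgettext_sketch, the context labels and the glue
character are assumptions, and the Czech strings are illustrative.

```perl
use strict;
use warnings;

# Separator between context and term inside a dictionary key.
# Any character that cannot occur in a term would do.
my $GLUE = "\x04";

my %context_dict = (
    "button${GLUE}Open" => { cs => 'Otevrit' },    # verb, label on a button
    "status${GLUE}Open" => { cs => 'Otevreno' },   # adjective, state of a thing
);

# Try the context-specific entry first, then the plain term,
# and finally fall back to the untranslated term.
sub pgettext_sketch {
    my ($context, $term, $lang) = @_;
    $lang ||= 'en';
    for my $key ("$context$GLUE$term", $term) {
        my $entry = $context_dict{$key} or next;
        return $entry->{$lang} if defined $entry->{$lang};
    }
    return $term;
}

print pgettext_sketch('button', 'Open', 'cs'), "\n";
print pgettext_sketch('status', 'Open', 'cs'), "\n";
print pgettext_sketch('menu',   'Open', 'cs'), "\n";   # no entry: stays 'Open'
```

  The cost is that the extraction utility has to record the context next
to each term, and translators see the same word more than once.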

  There are probably more, but this mail is already long as it is. Looking
forward to your comments.

- Robert
