Posted to fop-dev@xmlgraphics.apache.org by patrick andries <pa...@videotron.ca> on 2002/10/09 18:25:39 UTC

Fw: Arabic characters and FOP

I'm willing to help (slowly) implementing the bidi support
(contextualisation and bidi algorithms). Is someone else busy doing it ? Is
the time ripe to do so ?

> > ----- Original Message -----
> > From: <sa...@daimlerchrysler.com>
> >
> >
> >
> > (See attached file: example.fo)
> > (See attached file: fo.gif)
> > hi pat,
> > i am not using any bidi enabled editor, i just typed the fo using text
> > editor
>
>  [PA] I see, you are typing character references (entities).
>
> >  and view it in IE
>
>  [PA] Well, IE is bidi-enabled !
>
>  [PA] I suspect that to print it with FO, bidi needs to be implemented in
>  FO. (I'm still a volunteer to do it ;-))
>
>  Patrick Andries
>  o - 0 - 0
>  All of Unicode in French
>  New texts!
>  http://hapax.iquebec.com
>



---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
----- Original Message -----
From: "Peter B. West" <pb...@powerup.com.au>
Sent: 10 Oct 2002 20:40



> Patrick,
>
> I'm just trying to determine the limits.  At the moment 1.3 is needed to
> compile, because of TrueType font support, although users may continue
> to run the results in their existing 1.2 environments.  There has been
> vigorous discussion for as long as I have subscribed to this list about
> migration to later JDK versions.

That seems perfectly justifiable to me; I just wanted it clarified.





Re: Fw: Arabic characters and FOP

Posted by "Peter B. West" <pb...@powerup.com.au>.
Patrick,

I'm just trying to determine the limits.  At the moment 1.3 is needed to 
compile, because of TrueType font support, although users may continue 
to run the results in their existing 1.2 environments.  There has been 
vigorous discussion for as long as I have subscribed to this list about 
migration to later JDK versions.

Peter

patrick andries wrote:
> ----- Original Message -----
> From: "Oleg Tkachenko" <ol...@multiconn.com>
>
>> So, you are right about jdk1.3 - it remains to be seen. Other alternatives
>> could be our own TR9 implementation (afaik the RenderX guys went this way
>> last summer, probably because they have to support jdk1.1 and the MS JVM)
>> or some third-party implementation, e.g. ICU4J.
>
> They also claim to have only limited (or basic) bidi support, if I recall
> properly.
>
> Did I understand properly, Peter, that FOP 1.0 should support JDK 1.3?


-- 
Peter B. West  pbwest@powerup.com.au  http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"




Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
----- Original Message -----
From: "Oleg Tkachenko" <ol...@multiconn.com>
> So, you are right about jdk1.3 - it remains to be seen. Other alternatives
> could be our own TR9 implementation (afaik the RenderX guys went this way
> last summer, probably because they have to support jdk1.1 and the MS JVM)
> or some third-party implementation, e.g. ICU4J.

They also claim to have only limited (or basic) bidi support, if I recall
properly.

Did I understand properly, Peter, that FOP 1.0 should support JDK 1.3?









Re: Fw: Arabic characters and FOP

Posted by Oleg Tkachenko <ol...@multiconn.com>.
Peter B. West wrote:

> How is bidi support accessed in 1.3?  Must you make your own 
> determinations of directionality, or has the CDB BIDI data also been 
> smuggled in?

Well, some bidi support has been in the Swing packages for a long time, but it
seems to be intended entirely for internal use, e.g. the javax.swing.text.Bidi
class in jdk1.3 has package access level. Another one, sun.awt.font.Bidi, is
in a Sun proprietary package but is public. It is not as rich as the jdk1.4
one, but among its methods are createLineBidi, getVisualToLogicalMap,
getLevels etc., and at least it knows about all the LRM, RLM, PDF, LRO stuff.
So, you are right about jdk1.3 - it remains to be seen. Other alternatives
could be our own TR9 implementation (afaik the RenderX guys went this way last
summer, probably because they have to support jdk1.1 and the MS JVM) or some
third-party implementation, e.g. ICU4J.
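
For comparison, here is a minimal sketch of how the public jdk1.4 class can be queried for directional runs. It uses only java.text.Bidi; the 1.3-era sun.awt.font.Bidi is deliberately not shown, since its exact signatures are not given in this thread. The class name BidiRunsSketch and the helper describeRuns are invented for illustration, and the code is written for a current JDK rather than 1.4.

```java
import java.text.Bidi;

class BidiRunsSketch {
    /** Returns one "start-limit:level" entry per directional run, separated by spaces. */
    static String describeRuns(String text) {
        // The base direction is taken from the first strong character, defaulting to LTR.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(bidi.getRunStart(i)).append('-').append(bidi.getRunLimit(i))
              .append(':').append(bidi.getRunLevel(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin letters, a space, then the Hebrew letters alef, bet, gimel.
        String mixed = "abc \u05D0\u05D1\u05D2";
        System.out.println(describeRuns(mixed));
    }
}
```

Even levels denote left-to-right runs and odd levels right-to-left runs, which is the information a layout engine would need before placing glyphs.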

-- 
Oleg Tkachenko
eXperanto team
Multiconn International, Israel




Re: Fw: Arabic characters and FOP

Posted by "Peter B. West" <pb...@powerup.com.au>.
Oleg,

Thanks for this.  I just had a look at java.text.Bidi, and was led 
inexorably into java.awt.font, and from there back to bedrock - 
java.lang.Character.  It's Wonderland in there, and I haven't had time 
to digest it, but it looks as though all of the attributes from the 
Unicode character database are now encoded in java.lang.Character.  I.e. 
the CDB is available in 1.4.  It was also interesting to see that a 
significant subset of the CDB has been available since 1.1, notable 
omissions being the BIDI characteristics.
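
As a small illustration of the bidi data exposed through java.lang.Character since 1.4, the sketch below maps a character to the short name of its bidirectional class using Character.getDirectionality. The class name DirectionalityProbe and the helper classOf are made up for this example, and only a handful of the classes are handled.

```java
class DirectionalityProbe {
    /** Returns a short label for the bidi class of c; only a few classes are distinguished. */
    static String classOf(char c) {
        switch (Character.getDirectionality(c)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "L";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "R";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
                return "AL";
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
                return "EN";
            case Character.DIRECTIONALITY_WHITESPACE:
                return "WS";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Latin a, Hebrew alef, Arabic alef, a digit and a space.
        char[] samples = { 'a', '\u05D0', '\u0627', '1', ' ' };
        for (char c : samples) {
            System.out.println("U+" + Integer.toHexString(c).toUpperCase() + " -> " + classOf(c));
        }
    }
}
```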

How is bidi support accessed in 1.3?  Must you make your own 
determinations of directionality, or has the CDB BIDI data also been 
smuggled in?

Peter

Oleg Tkachenko wrote:
> patrick andries wrote:
> 
>> I'm willing to help (slowly) implementing the bidi support
>> (contextualisation and bidi algorithms). Is someone else busy doing it 
>> ? Is
>> the time ripe to do so ?
> 
> Implementing the bidi algorithm itself is not a problem, as Java has had
> bidi support since jdk1.3 (it's hidden in 1.3 and revealed in 1.4 in the
> form of the java.text.Bidi class). So I believe there are no obstacles for
> the redesigned fop to implement bidi support.
> 
> PS.Actually it's even feasible to produce hebrew pdf using fop right now 
> (well, under certain circumstances).


-- 
Peter B. West  pbwest@powerup.com.au  http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"




Re: Fw: Arabic characters and FOP

Posted by "Peter B. West" <pb...@powerup.com.au>.
Patrick,

It remains to be seen how the 1.4 BIDI handling can be integrated with 
FOP's layout, and how BIDI processing can be gracefully declined for 
users not running 1.4.  As Unicode and i18n are your strengths, I think 
your input on this would be very valuable.  Perhaps Oleg is already 
considering the issues, in which case you could work together.
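
One possible way to decline bidi processing gracefully is to probe for java.text.Bidi at run time and keep every reference to it inside a separate class that is only loaded when the probe succeeds. The sketch below is an assumption about how this could be done, not a description of FOP code; BidiGate, BidiDelegate, classPresent and shouldReorder are hypothetical names. The delegate would still have to be compiled against 1.4 (or called through reflection) for the rest of the code to build on an older JDK.

```java
class BidiGate {
    /** Returns true if the named class can be loaded by the running JVM. */
    static boolean classPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    /** True only when the text needs bidi handling and the runtime can provide it. */
    static boolean shouldReorder(String text) {
        if (!classPresent("java.text.Bidi")) {
            // Decline gracefully: the caller keeps the text in logical order.
            return false;
        }
        return BidiDelegate.requiresBidi(text);
    }

    /** Isolated so that java.text.Bidi is only linked when this class is first used. */
    static class BidiDelegate {
        static boolean requiresBidi(String text) {
            char[] chars = text.toCharArray();
            return java.text.Bidi.requiresBidi(chars, 0, chars.length);
        }
    }

    public static void main(String[] args) {
        System.out.println(shouldReorder("plain Latin text"));
        System.out.println(shouldReorder("\u05D0\u05D1\u05D2"));
    }
}
```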

Peter

patrick andries wrote:
> ----- Original Message -----
> From: "Oleg Tkachenko" <ol...@multiconn.com>
>
>> patrick andries wrote:
>>
>>> Good, no need to help thus.
>>
>> I didn't say that! Volunteers are desperately needed, they are the blood of
>> the fop project, so if you are willing - you are welcome. Look at today's
>> "New Developer Suggestion" thread, for example.
> 
> 
> Okay, okay. I will have a look at it.
> 
> My own strengths are Unicode, i18n (I still believe you must understand what
> you are doing and test the different scripts even if Java does most of the
> job) and fonts (OpenType for instance(*)). I would like to help (not
> alone...) give the best support in this area, but I'm willing to help
> (slowly) in other areas in the mean time if the new code is not yet ready to
> add these features. I will have a look at the todo list and the state of the
> code and come back (privately) when I have some time.
> 
> Patrick Andries
> (member of the Canadian character set committee
> translator for the ISO JT/SC2/GT2)
> http://hapax.iquebec.com

-- 
Peter B. West  pbwest@powerup.com.au  http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"




Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
----- Original Message -----
From: "J.Pietschmann" <j3...@yahoo.de>
To: <fo...@xml.apache.org>
Sent: 9 Oct 2002 17:27
Subject: Re: Fw: Arabic characters and FOP


> patrick andries wrote:
> >>There are also various code points assigned
> >>to ligatures and presentation forms, for example U+FB01, which
> >>could be used in the FO source (at the risk of confusing
> >>hyphenation, spell checkers and others).
>
> > Not a good idea, these code points are deprecated. Ligatures are glyphs
> > not characters, Unicode is about characters (yes, I know there are
> > "historical" and compatibility exceptions)
> I should have added "drawing the wrath of the Unicode folks" to the
> risks :)
>
> > Also, some ligatures are purely discretionary (like the ligated fi you
> > mentioned in U+FB01). This behaviour should be driven by some styling
> > information, I would assume ("I want a nice ffl ligature here if present
> > in the font, and here a ct ligature if present"). I do not know of any
> > available means to specify this. The same is true for glyph variants (I
> > would like this particular ampersand variant).
> Variants should probably be represented by different fonts. I *hope* fonts
> which have glyph variants for certain characters are rare enough...

There will be more and more of them with OpenType.

> I think ligatures could be explicitly prevented by inserting some zero-width
> characters (non-breaking spaces or joiners?).

Yes, but this does not allow selecting among many different behaviours.

> > What are the CSS people doing about this ?
> It seems there are more pressing problems to solve. I'm not familiar
> with recent CSS3 developments though.

Well, it depends on your constituency: OpenType is very valuable for
non-Latin scripts and for fine Latin typography.

> >>Also, the discussion whether presentation forms have to be
> >>expressed by the characters itself or out of band, for example
> >>as fonts, has never ended.
> >
> > Unicode is quite plain about this, I believe it even states somewhere
> > that the Arabic presentation forms were a bad idea.
> Yes, Unicode is explicit about this. But there is still a sizeable
> fraction left which thinks otherwise...

Well, as long as they use Unicode ;-)  This is also the philosophy adopted
by OpenType.

But we can leave that for later and follow whatever other standards come up
with for finer control.


P. Andries
- o - O - o -
Unicode in French: http://hapax.iquebec.com





Re: Fw: Arabic characters and FOP

Posted by "J.Pietschmann" <j3...@yahoo.de>.
patrick andries wrote:
>>There are also various code points assigned
>>to ligatures and presentation forms, for example U+FB01, which
>>could be used in the FO source (at the risk of confusing
>>hyphenation, spell checkers and others).

> Not a good idea, these code points are deprecated. Ligatures are glyphs not
> characters, Unicode is about characters (yes, I know there are "historical"
> and compatibility exceptions)
I should have added "drawing the wrath of the Unicode folks" to the
risks :)

> Also, some ligatures are purely discretionary (like the ligated fi you
> mentioned in U+FB01). This behaviour should be driven by some styling
> information, I would assume ("I want a nice ffl ligature here if present in
> the font, and here a ct ligature if present"). I do not know of any
> available means to specify this. The same is true for glyph variants (I
> would like this particular ampersand variant).
Variants should probably be represented by different fonts. I *hope* fonts
which have glyph variants for certain characters are rare enough...
As for ligatures, AFAIK they follow established rules, and therefore
you have basically four options:
- professional: follow the established rules as far as possible
- artistic: make your own rules and follow them
- plain: never do ligatures
- lousy: random behaviour

I think ligatures could be explicitly prevented by inserting some zero-width
characters (non-breaking spaces or joiners?).
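
To make the zero-width idea concrete: the character Unicode provides for asking that two neighbours not be joined or ligated is U+200C ZERO WIDTH NON-JOINER. The sketch below, with the invented names LigatureBreaker and breakAt, simply inserts it at a chosen position, here at the morpheme boundary of "auffinden", a word mentioned elsewhere in this thread. Whether a given renderer or font honours the character is an assumption that would have to be tested.

```java
class LigatureBreaker {
    /** U+200C ZERO WIDTH NON-JOINER. */
    static final char ZWNJ = '\u200C';

    /** Inserts a ZWNJ before the character at the given index. */
    static String breakAt(String word, int index) {
        return word.substring(0, index) + ZWNJ + word.substring(index);
    }

    public static void main(String[] args) {
        // "auf" + "finden": the two f's belong to different morphemes.
        String marked = breakAt("auffinden", 3);
        System.out.println(marked.length());
    }
}
```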

> What are the CSS people doing about this ?
It seems there are more pressing problems to solve. I'm not familiar
with recent CSS3 developments though.

>>Also, the discussion whether presentation forms have to be
>>expressed by the characters itself or out of band, for example
>>as fonts, has never ended.
> 
> Unicode is quite plain about this, I believe it even states somewhere that
> the Arabic presentation forms were a bad idea.
Yes, Unicode is explicit about this. But there is still a sizeable
fraction left which thinks otherwise...

J.Pietschmann




Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
----- Original Message -----
From: "J.Pietschmann" <j3...@yahoo.de>
To: <fo...@xml.apache.org>
Sent: 9 Oct 2002 15:30
Subject: Re: Fw: Arabic characters and FOP


> patrick andries wrote:
>  > (*) Does anybody know how glyph variants and ligatures (see the
>  > substitution feature in OpenType) should be selected from fo? I believe
>  > there is currently no such mechanism. Should we wait until another
>  > version of XSL-FO? Extensions?
>
> IIRC the spec mentions it's at the whim of the processor to
> provide ligatures.
> There are also various code points assigned
> to ligatures and presentation forms, for example U+FB01, which
> could be used in the FO source (at the risk of confusing
> hyphenation, spell checkers and others).

Not a good idea, these code points are deprecated. Ligatures are glyphs not
characters, Unicode is about characters (yes, I know there are "historical"
and compatibility exceptions)

> If such characters
> are mapped to glyphs by a font, FOP can handle them.

The idea with OpenType (the merging of PS1 and TTF fonts) is to allow these
ligatures to be formed at rendering time (as is necessary with many
non-Latin scripts), i.e. within the glyph space.

Also, some ligatures are purely discretionary (like the ligated fi you
mentioned in U+FB01). This behaviour should be driven by some styling
information, I would assume ("I want a nice ffl ligature here if present in
the font, and here a ct ligature if present"). I do not know of any
available means to specify this. The same is true for glyph variants (I
would like this particular ampersand variant). What are the CSS people doing
about this? Should we follow them?

> Also, the discussion whether presentation forms have to be
> expressed by the characters itself or out of band, for example
> as fonts, has never ended.

Unicode is quite plain about this; I believe it even states somewhere that
the Arabic presentation forms were a bad idea. This is at least what the
technical director of the Unicode consortium said in an interview I
conducted (in French, though:
http://iquebec.ifrance.com/hapax/pdf/whistler.pdf
in translation: "This has already happened: for instance, no Unicode
implementation of Arabic bothers with the large number of Arabic ligatures
encoded in Unicode, ligatures whose inclusion was a mistake. Correct Arabic
implementations use the basic and extended Arabic characters and delegate
ligature formation to the fonts, as they should.")


P. Andries





Re: Fw: Arabic characters and FOP

Posted by "J.Pietschmann" <j3...@yahoo.de>.
patrick andries wrote:
 > (*) Does anybody know how glyph variants and ligatures (see the
 > substitution feature in OpenType) should be selected from fo? I believe
 > there is currently no such mechanism. Should we wait until another version of
 > XSL-FO? Extensions?

IIRC the spec mentions it's at the whim of the processor to
provide ligatures. There are also various code points assigned
to ligatures and presentation forms, for example U+FB01, which
could be used in the FO source (at the risk of confusing
hyphenation, spell checkers and others). If such characters
are mapped to glyphs by a font, FOP can handle them.

Also, the discussion whether presentation forms have to be
expressed by the characters itself or out of band, for example
as fonts, has never ended. XSLFO obviously provides font
specification, but how presentational forms expressed as Unicode
characters have to be handled is not mentioned at all in the
spec and is presumably left to the processor.

It shouldn't be all that much work to provide mappings from
presentational forms to canonical code points and hook them into the
character lookup: if the lookup for the currently selected fonts fails,
the mapping kicks in, perhaps selecting a new first priority font and
repeating the lookup with the mapping result. This could also
easily map &#xFB01; to "fi". The other way around ("fi" -> &#xFB01;)
is much more difficult, because of overlapping mappings needing
look-aheads (the ff and ffi ligatures) and irregularities, for
example "auffinden" does not use the ffi ligature.
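
The easy direction described above can be sketched with Unicode compatibility normalization. The sketch assumes java.text.Normalizer (available in later JDKs, not in the 1.3/1.4 releases discussed in this thread) and uses a plain set of characters as a stand-in for a font's glyph lookup; PresentationFormFallback, coverageOf and resolve are hypothetical names, and characters outside the BMP are ignored for brevity.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Set;

class PresentationFormFallback {
    /** Builds a toy "font coverage" set from the characters of a string. */
    static Set<Character> coverageOf(String chars) {
        Set<Character> set = new HashSet<Character>();
        for (int i = 0; i < chars.length(); i++) {
            set.add(chars.charAt(i));
        }
        return set;
    }

    /**
     * Keeps characters the font covers; otherwise falls back to the
     * compatibility (NFKC) mapping, e.g. U+FB01 becomes "fi".
     */
    static String resolve(String text, Set<Character> coverage) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (coverage.contains(c)) {
                out.append(c);
            } else {
                out.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKC));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The toy font has f, i and n but no glyph for the fi ligature code point.
        System.out.println(resolve("\uFB01n", coverageOf("fin")));
    }
}
```

The same mapping also covers Arabic presentation forms, for example U+FE8D (isolated alef) maps back to U+0627.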

J.Pietschmann




RE: Fw: Arabic characters and FOP

Posted by Victor Mote <vi...@outfitr.com>.
Patrick Andries wrote:

> (*) Does anybody know how glyph variants and ligatures (see the
> substitution feature in OpenType) should be selected from fo? I believe
> there is currently no such mechanism. Should we wait until another version
> of XSL-FO? Extensions?

If I understand the OpenType standard properly, it is supposed to do most of
this automatically (it seems to in some applications, but I am not sure
whether it is the font or the application doing the work). The question may
actually become how to turn it off if you don't want it. See Section 7.8.1
for a discussion of how this ties in with XSL-FO. This is one reason why I
am working on trying to get our font support upgraded to handle at least
OpenType fonts that are registered at the O/S level (see the "fonts" thread
over the past few days). In the meantime, and for other font types, I
suppose a lot of this could be done with some regular expression
preprocessing -- it might even be possible to do this within the parser or
XSLT transformer.
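
A rough sketch of what such regular expression preprocessing might look like, assuming the goal is to replace f-sequences with the ligature code points U+FB00 to U+FB04. LigaturePreprocessor and ligate are invented names. The alternation lists the longest sequences first so that "ffi" wins over "ff"; being purely textual, it knows nothing about word structure, so it also ligates across the morpheme boundary in "auffinden", the irregular case raised elsewhere in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LigaturePreprocessor {
    private static final Map<String, String> LIGATURES = new HashMap<String, String>();
    static {
        LIGATURES.put("ff", "\uFB00");
        LIGATURES.put("fi", "\uFB01");
        LIGATURES.put("fl", "\uFB02");
        LIGATURES.put("ffi", "\uFB03");
        LIGATURES.put("ffl", "\uFB04");
    }

    // Alternatives are tried left to right, so the three-letter forms come first.
    private static final Pattern SEQUENCES = Pattern.compile("ffi|ffl|ff|fi|fl");

    /** Replaces every matching letter sequence with its ligature code point. */
    static String ligate(String text) {
        Matcher m = SEQUENCES.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, LIGATURES.get(m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(ligate("office").length());
        // Naive result: the ffi ligature is applied although the word is auf + finden.
        System.out.println(ligate("auffinden").length());
    }
}
```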

Victor Mote




Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
----- Original Message -----
From: "Oleg Tkachenko" <ol...@multiconn.com>


> patrick andries wrote:
> > Good, no need to help thus.
> I didn't say that! Volunteers are desperately needed, they are the blood of
> the fop project, so if you are willing - you are welcome. Look at today's
> "New Developer Suggestion" thread, for example.

Okay, okay. I will have a look at it.

My own strengths are Unicode, i18n (I still believe you must understand what
you are doing and test the different scripts even if Java does most of the
job) and fonts (OpenType for instance(*)). I would like to help (not
alone...) give the best support in this area, but I'm willing to help
(slowly) in other areas in the mean time if the new code is not yet ready to
add these features. I will have a look at the todo list and the state of the
code and come back (privately) when I have some time.

Patrick Andries
(member of the Canadian character set committee
translator for the ISO JT/SC2/GT2)
http://hapax.iquebec.com

(*) Does anybody know how glyph variants and ligatures (see the
substitution feature in OpenType) should be selected from fo? I believe
there is currently no such mechanism. Should we wait until another version
of XSL-FO? Extensions?





Re: Fw: Arabic characters and FOP

Posted by Oleg Tkachenko <ol...@multiconn.com>.
patrick andries wrote:
> Good, no need to help thus.
I didn't say that! Volunteers are desperately needed, they are the blood of
the fop project, so if you are willing - you are welcome. Look at today's
"New Developer Suggestion" thread, for example.

-- 
Oleg Tkachenko
eXperanto team
Multiconn International, Israel




Re: Fw: Arabic characters and FOP

Posted by patrick andries <pa...@videotron.ca>.
Good, no need to help thus.

----- Original Message -----
From: "Oleg Tkachenko" <ol...@multiconn.com>
To: <fo...@xml.apache.org>
Sent: 9 Oct 2002 13:51
Subject: Re: Fw: Arabic characters and FOP


> patrick andries wrote:
> > I'm willing to help (slowly) implementing the bidi support
> > (contextualisation and bidi algorithms). Is someone else busy doing it ?
> > Is the time ripe to do so ?
> Implementing the bidi algorithm itself is not a problem, as Java has had
> bidi support since jdk1.3 (it's hidden in 1.3 and revealed in 1.4 in the
> form of the java.text.Bidi class). So I believe there are no obstacles for
> the redesigned fop to implement bidi support.
>
> PS.Actually it's even feasible to produce hebrew pdf using fop right now
> (well, under certain circumstances).
> --
> Oleg Tkachenko
> eXperanto team
> Multiconn International, Israel





Re: Fw: Arabic characters and FOP

Posted by Oleg Tkachenko <ol...@multiconn.com>.
patrick andries wrote:
> I'm willing to help (slowly) implementing the bidi support
> (contextualisation and bidi algorithms). Is someone else busy doing it ? Is
> the time ripe to do so ?
Implementing the bidi algorithm itself is not a problem, as Java has had
bidi support since jdk1.3 (it's hidden in 1.3 and revealed in 1.4 in the form
of the java.text.Bidi class). So I believe there are no obstacles for the
redesigned fop to implement bidi support.
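
As a hedged illustration of what the 1.4 class offers beyond level analysis, the sketch below reorders one line of text from logical to visual order with the static Bidi.reorderVisually method. VisualOrderSketch and toVisual are invented names. The sketch works per char, so it ignores mirroring, combining marks and surrogate pairs; it is a simplification, not a layout algorithm.

```java
import java.text.Bidi;

class VisualOrderSketch {
    /** Returns the characters of a single line in left-to-right display order. */
    static String toVisual(String line) {
        Bidi bidi = new Bidi(line, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        int n = line.length();
        byte[] levels = new byte[n];
        Character[] chars = new Character[n];
        for (int i = 0; i < n; i++) {
            levels[i] = (byte) bidi.getLevelAt(i);
            chars[i] = Character.valueOf(line.charAt(i));
        }
        // Reverses the right-to-left stretches in place according to their levels.
        Bidi.reorderVisually(levels, 0, chars, 0, n);
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(chars[i].charValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Logical order: Latin letters, a space, then Hebrew alef, bet, gimel.
        String visual = toVisual("abc \u05D0\u05D1\u05D2");
        System.out.println(visual.length());
    }
}
```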

PS.Actually it's even feasible to produce hebrew pdf using fop right now 
(well, under certain circumstances).
-- 
Oleg Tkachenko
eXperanto team
Multiconn International, Israel

