Posted to fop-dev@xmlgraphics.apache.org by Manuel Mall <mm...@arcus.com.au> on 2005/11/07 08:24:14 UTC
Is getNextKnuthElements the right interface for inline LMs?
As you know, I am looking into the white space handling, and this has now
expanded into Unicode line breaking, handling of Unicode formatting
characters (e.g. ZWSP), and getting a handle on all the different break
scenarios and their related Knuth sequences. Joerg threw glyph
merging / substitution into the mix, and then we have l-r writing modes
and BIDI.
What I observed is that most of these issues cannot be solved by looking
at a single character at a time. They need context, very often only one
character, sometimes more (e.g. a sequence of white space). More
importantly, the context needed is not limited to the fo they occur in.
They all span across fos. This is where the current LM structures and
especially the getNextKnuthElements interface really get in the way of
things. Basically one cannot create the correct Knuth sequences without
the context, but the context can come from anywhere (superior fo,
subordinate fo, or neighboring fo). So one needs look-ahead and
backtracking features across all these boundaries, and it feels extremely
messy.
It appears conceptually so much simpler to have only a single loop
iterating over all the characters in a paragraph doing all the
character/glyph manipulation, word breaking (hyphenation), and line
breaking analysis and generation of the Knuth sequences in one place.
An example where this is currently done is the white space handling
during refinement. One loop at block level based on a recursive char
iterator that supports deletion and character replacement does the job.
Very simple and easy to understand. I have something similar in mind
for inline Knuth sequence generation. Of course the iterator would not
only return the character but relevant formatting information for it as
well, e.g. the font so the width etc. can be calculated. The iterator
may also have to indicate start/end border/padding and conditional
border/padding elements.
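
To make the idea concrete, here is a minimal sketch of such a loop. It is
not FOP code: the names InlineCharSketch, StyledChar, flatten and
collapseWhiteSpace are made up for illustration, the font is just a string,
and border/padding markers are left out. It only shows how one loop over a
flattened paragraph can collapse a white space run that spans two fos while
keeping the formatting information attached to each character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of the FOP code base.
class InlineCharSketch {

    // One character plus the formatting context it came from.
    public static final class StyledChar {
        public final char ch;
        public final String fontName; // stand-in for real font/width info
        public final int foIndex;     // which fo the character belongs to

        public StyledChar(char ch, String fontName, int foIndex) {
            this.ch = ch;
            this.fontName = fontName;
            this.foIndex = foIndex;
        }
    }

    // Flatten the text of several fos into one paragraph-wide sequence.
    // runs[i] is the text of fo i, fonts[i] its (assumed) font name.
    public static List<StyledChar> flatten(String[] runs, String[] fonts) {
        List<StyledChar> out = new ArrayList<>();
        for (int fo = 0; fo < runs.length; fo++) {
            for (char c : runs[fo].toCharArray()) {
                out.add(new StyledChar(c, fonts[fo], fo));
            }
        }
        return out;
    }

    // Single loop over the whole paragraph: a space is dropped when the
    // previous character was also a space, even if that previous
    // character lives in a different fo.
    public static List<StyledChar> collapseWhiteSpace(List<StyledChar> chars) {
        List<StyledChar> out = new ArrayList<>();
        boolean prevWasSpace = false;
        for (StyledChar sc : chars) {
            boolean isSpace = sc.ch == ' ';
            if (isSpace && prevWasSpace) {
                continue; // context from the neighboring fo decides this
            }
            out.add(sc);
            prevWasSpace = isSpace;
        }
        return out;
    }

    // Helper for inspection: the plain text of a styled sequence.
    public static String text(List<StyledChar> chars) {
        StringBuilder sb = new StringBuilder();
        for (StyledChar sc : chars) {
            sb.append(sc.ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<StyledChar> para = flatten(
                new String[] {"Hello ", " world"},
                new String[] {"Helvetica", "Helvetica-Bold"});
        System.out.println(text(collapseWhiteSpace(para)));
    }
}
```

The same loop shape would be the place to hang line break analysis and
Knuth element generation, since every decision sees its neighbors
regardless of which fo they come from.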
Of course that would be quite a change internally although limited to
inline LMs and not affecting any block level operations. The way to do
this would be a branch in svn. But before I embark on such an endeavour
I'd like to seek some feedback on the list. Anyone aware of serious
problems with such an approach? Has it been tried before and failed for
example? Those who designed the current getNextKnuth approach may have
arguments why changing it for inline LMs is a bad idea? Any other
views / concerns?
Thanks
Manuel
Re: Is getNextKnuthElements the right interface for inline LMs?
Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 07.11.2005 08:24:14 Manuel Mall wrote:
<snip/>
> Of course that would be quite a change internally although limited to
> inline LMs and not affecting any block level operations. The way to do
> this would be a branch in svn. But before I embark on such an endeavour
> I'll like to seek some feedback on the list. Anyone aware of serious
> problems with such an approach?
No.
> Has it been tried before and failed for example?
We had to change a few things during the transition to the Knuth
approach. Sometimes, changes are necessary and it makes no sense to
stubbornly stick to what is already there.
> Those who designed the current getNextKnuth approach may have
> arguments why changing it for inline LMs is a bad idea?
I have none. You seem to have good arguments for changing the interface.
Still, care should be taken that the LMs stay as uniform as possible, so
that layout managers for custom elements can still be added, and that
non-character content is handled well and without too much custom logic,
because the changed approach focuses strongly on text.
> Any other views / concerns?
That said, it should be noted that I haven't yet dived into the
Unicode stuff you've been discussing lately. I'm very happy about the
flurry of activity in this area. It looked like a good discussion. I
hope you will excuse me if I don't participate too much there right now.
Jeremias Maerki