Posted to fop-dev@xmlgraphics.apache.org by "J.Pietschmann" <j3...@yahoo.de> on 2003/04/24 23:31:48 UTC

How layout works

Hi all,
it might be interesting to write up again how the layout core works
in the old and in the new code.

In the old code, most of the layout functionality is concentrated
in the layout() methods of the FOs. Also most of the necessary state
is concentrated as fields in the FOs. The most notable exception to
these rules is LineArea.addText() and related functionality, obviously
because there is no FO corresponding to a line. Another exception
is table layout, mainly because of row- and column-spanning cells.
There are also a few relevant code parts sprinkled unnecessarily
and unintuitively elsewhere: for example, some important break-before
logic is attached to the property, and some static (!) methods
important for footnotes are attached to an area class.

The layout can be roughly described by this pseudo code:
   setup an initial state for the FO's layout process
   create an area for the FO, if applicable and necessary
   if there are children
     foreach child
       layout child
       if keep/other conditions violated
         rollback layout
       else if line|column|page is full
         create a new area for the FO
     end foreach
   finalize area and add to area tree
The layout rollback is the greatest problem. You need to
- reset the FO's state to a snapshot
- reset global state: IDs, out-of-line areas (fortunately,
   no floats yet), markers.
Neither is cleanly implemented in the old code, which is a recurrent
source of famous and infamous bugs, but this could be cured.
The next problem is that a page is fed to the renderer as soon as it
is complete, which makes handling conditions influencing more than one
page impossible.
Granted, every example I saw which *required* a look-ahead of more
than one page for correct layout was artificially constructed, but
this does not mean they won't ever occur in the wild.
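
To make the rollback requirement concrete, here is a minimal sketch in Java.
It is not taken from the FOP sources: RollbackSketch, LayoutState and fillPage
are hypothetical names, children are reduced to plain heights, and the only
"global" state modelled is an ID-to-page map. Out-of-line areas and markers
would need the same snapshot/restore treatment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; none of these names are taken from the FOP sources.
final class RollbackSketch {

    /** Everything that must be undone together when layout is rolled back. */
    static final class LayoutState {
        int childrenPlaced;                                     // per-FO progress
        final Map<String, Integer> idToPage = new HashMap<>();  // "global" ID registry

        /** Copies all fields so later mutations cannot leak into the snapshot. */
        LayoutState snapshot() {
            LayoutState copy = new LayoutState();
            copy.childrenPlaced = childrenPlaced;
            copy.idToPage.putAll(idToPage);
            return copy;
        }

        /** Resets FO-local and global state in one step. */
        void restore(LayoutState saved) {
            childrenPlaced = saved.childrenPlaced;
            idToPage.clear();
            idToPage.putAll(saved.idToPage);
        }
    }

    /**
     * Lays out children (reduced to their heights) onto one page, starting at
     * index start. Returns the index of the first child left for the next page.
     * A child flagged keepWithNext may not be the last one on a page.
     */
    static int fillPage(int[] heights, boolean[] keepWithNext, int start,
                        int pageHeight, int page, LayoutState state) {
        state.childrenPlaced = start;
        LayoutState lastLegalBreak = state.snapshot();
        int used = 0;
        for (int i = start; i < heights.length; i++) {
            if (used + heights[i] > pageHeight) {
                // Page full: undo everything placed since the last legal break.
                state.restore(lastLegalBreak);
                return state.childrenPlaced;
            }
            used += heights[i];
            state.childrenPlaced = i + 1;
            state.idToPage.put("child-" + i, page);
            if (!keepWithNext[i]) {
                lastLegalBreak = state.snapshot();
            }
        }
        return state.childrenPlaced;
    }

    public static void main(String[] args) {
        int[] heights = {40, 30, 30, 20};
        boolean[] keep = {false, true, false, false};
        LayoutState state = new LayoutState();
        int next = fillPage(heights, keep, 0, 90, 1, state);
        System.out.println("page 1 ends before child " + next + ", ids: " + state.idToPage);
    }
}
```

The sketch deliberately ignores the case where a single child, or a chain of
kept-together children, is taller than an empty page; a caller would have to
detect that fillPage made no progress.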

In the new code the layout is done by layout managers. Basically
every FO has a layout manager; additionally there are managers
for lines, columns and pages.
Instead of creating and filling areas, the new code is organized
around computing breaks. A layout manager has three important
methods:
  1. initialize layout manager state (void)
  2. get next break possibility (param: layout context)
       foreach child layout manager
         loop
           get next break possibility from child layout manager
           if this break possibility makes a valid break
             return it
         end loop
       end foreach
  3. get areas (param: next break possibility) for FO creating areas
       create area
       for break possibility
             from last saved break possibility
             to next break possibility
         get areas from break possibility {actually from some
           child layout managers}
         add areas to area
       end for
       return area
For FOs which don't create normal areas, method 3 returns the areas
in the loop directly.
You'll probably see the problems:
- control flow oscillates between lots of objects
- the layout state is distributed over lots of objects: the layout manager
   and its numerous super classes, the break possibilities, the position
   objects encapsulated by break possibility objects, various iterators
   and to some extent the layout context passed around
Also, the iterator classes for retrieving layout managers and break
possibilities don't operate on explicit or even tangible collections, which
makes my brain hurt (but I've seen worse examples of this syndrome;
iterators seem to be attractive). Note that while there are lists of break
possibilities, the BreakPossPosIterator iterates over a tree of break
possibilities.
In general, I like the idea of creating break possibilities first and seeing
whether they create break possibilities for the parent, because this makes
backtracking easier: instead of unrolling ready-to-render layout, just
restart at a previous break possibility with a new layout context. An
interesting side effect is that for certain renderers the area tree
doesn't have to be constructed explicitly: you can try to render the
areas right after they are constructed, provided the break possibility
provides enough information (and there is no forward page number
reference).
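
The two-phase idea (find a break first, build areas afterwards) can be
illustrated with a deliberately flat sketch. All names below (BreakPossSketch,
BreakPoss, getNextBreakPoss, addAreas, paginate) are made up for this example
and do not mirror the real layout manager classes; children are reduced to
their block-progression dimension, and an "area" is just a string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and method names are illustrative, not FOP's.
final class BreakPossSketch {

    /** A candidate break: children up to index end (exclusive) use bpd units. */
    static final class BreakPoss {
        final int end;
        final int bpd;
        BreakPoss(int end, int bpd) { this.end = end; this.bpd = bpd; }
    }

    /** Phase 1: find the furthest break after 'from' that fits; builds no areas. */
    static BreakPoss getNextBreakPoss(int[] childBpd, int from, int available) {
        BreakPoss best = null;
        int used = 0;
        for (int i = from; i < childBpd.length; i++) {
            used += childBpd[i];
            if (used > available) {
                break;
            }
            best = new BreakPoss(i + 1, used);
        }
        return best; // null means not even one child fits
    }

    /** Phase 2: create the areas between the previous break and the chosen one. */
    static List<String> addAreas(int[] childBpd, int from, BreakPoss bp) {
        List<String> areas = new ArrayList<>();
        for (int i = from; i < bp.end; i++) {
            areas.add("area[" + i + "] bpd=" + childBpd[i]);
        }
        return areas;
    }

    /** Driver loop: one list of areas per page. */
    static List<List<String>> paginate(int[] childBpd, int pageBpd) {
        List<List<String>> pages = new ArrayList<>();
        int from = 0;
        while (from < childBpd.length) {
            BreakPoss bp = getNextBreakPoss(childBpd, from, pageBpd);
            if (bp == null) {
                throw new IllegalArgumentException("child " + from + " does not fit on an empty page");
            }
            pages.add(addAreas(childBpd, from, bp));
            from = bp.end;
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(paginate(new int[] {40, 30, 30, 20}, 90));
    }
}
```

Because phase 1 creates no areas, backtracking in this sketch amounts to
calling getNextBreakPoss again from an earlier index with a different
available height; nothing has to be torn down.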

My approach would be
1. get rid of all the iterators and move the functionality to the
    layout managers, because layout managers currently store much
    of the same info anyway.
2. get rid of position objects and fold their data into break possibilities
3. make a proper hierarchy of break possibility objects
4. move stacking info to the layout context and make proper snapshots.
5. eventually get rid of the layout managers which correspond 1-to-1
    to FOs and move the functionality back to the FOs.
This is however not a small task, and I'm reluctant to start, partly because
inquiries to Keiron and Karen for rationales of the current design didn't
result in anything, not even a statement like "It looked good".
Also IDs, markers and out-of-line areas will need some more attention;
the mapping from IDs to pages/areas for example should be stored in either
the next break possibility or a layout context snapshot.

Comments?

J.Pietschmann


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


RE: How layout works

Posted by Victor Mote <vi...@outfitr.com>.
J.Pietschmann wrote:

> Hair-pulling painful. Imagine the following:
>   <fo:flow writing-mode="lr-tb">
>     <fo:block-container writing-mode="tb-lr">
>       <fo:block>lots of gibberish</fo:block>
>     </fo:block-container>
>     <fo:block>more gibberish</fo:block>
>   </fo:flow>
> I'm not sure what the spec really says but most people will assume that
> the layout will minimize the BPD of the block container (which is the
> IPD of the block therein), so that the area is well filled and possibly
> the block following the block container can start on the page.

I understand what you are saying. Perhaps this should be addressed with an
extension that allows the user to specify which dimension to fill first. In
the meantime, it seems reasonable to follow the assumption given and move
forward (?). Only if we see places where our design is not flexible enough to
handle anything that is thrown at it would I be concerned.

> If this isn't painful enough, add a tb-lr footnote:
>   <fo:flow writing-mode="lr-tb">
>     <fo:block-container writing-mode="tb-lr">
>       <fo:block>lots of gibberish
>         <fo:footnote>...</fo:footnote>
>         more stuff
>       </fo:block>
>     </fo:block-container>
>     <fo:block>more gibberish</fo:block>
>   </fo:flow>
> Happy balancing! Side floats in this context have similarly detrimental
> influence on your mental health.

I guess my view on these more esoteric issues is that balancing and
beautifying is less important than getting the basics working. If we give
the user /something/ here that meets the spec, then they can give us or the
standard-writers instructions on what needs to be changed to implement this
better. Again, our implementation doesn't have to be perfect right away,
just our design :-).

> Branch-and-bound or damped iterations will deal with such problems,
> but at great cost.

We'll use good design, efficient algorithms, give the user choices on the
quality / performance continuum, and warn them about expensive operations.
Beyond that, the cost belongs to them. If they want to design documents that
are expensive to build, then they are going to be expensive to build. If it
is less costly for them to write them out by hand and send them by snail
mail, then so be it.

Victor Mote


Re: How layout works

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Victor Mote wrote:
> By show-stopper, do you mean "impossible" or merely "nearly impossible, and
> hair-pulling painful"?

Hair-pulling painful. Imagine the following:
  <fo:flow writing-mode="lr-tb">
    <fo:block-container writing-mode="tb-lr">
      <fo:block>lots of gibberish</fo:block>
    </fo:block-container>
    <fo:block>more gibberish</fo:block>
  </fo:flow>
I'm not sure what the spec really says but most people will assume that
the layout will minimize the BPD of the block container (which is the
IPD of the block therein), so that the area is well filled and possibly
the block following the block container can start on the page.

If this isn't painful enough, add a tb-lr footnote:
  <fo:flow writing-mode="lr-tb">
    <fo:block-container writing-mode="tb-lr">
      <fo:block>lots of gibberish
        <fo:footnote>...</fo:footnote>
        more stuff
      </fo:block>
    </fo:block-container>
    <fo:block>more gibberish</fo:block>
  </fo:flow>
Happy balancing! Side floats in this context have similarly detrimental
influence on your mental health.

Side floats alone are not that hard to handle, unless orphans/widows and
other keep conditions complicate life. It can be made considerably more
miserable by letting side floats occur en masse, preferably flowing to
different edges. Stacking side floats properly is something I'd implement
even after automatic table column widths.
I forgot multi-column layouts with all-column spans on the list of hard
problems. Get mad and build a file with multi-column layout, several
all-column spans, side-float clusters, footnotes and mixed writing modes
all on one page.

Branch-and-bound or damped iterations will deal with such problems,
but at great cost.

J.Pietschmann


RE: How layout works

Posted by Victor Mote <vi...@outfitr.com>.
J.Pietschmann wrote:

> > Knuth has
> > laid out the theory, algorithms, and implementation in such detail on the
> > line-breaking logic, that it seems (to my optimistic mind) for the other to
> > be in reach as well.
>
> Show stoppers: writing modes and side floats. Tables with automatically
> determined column widths are still expensive, especially if spanning pages
> with page masters with different body width (the spec doesn't talk about
> this, for good reason).

By show-stopper, do you mean "impossible" or merely "nearly impossible, and
hair-pulling painful"? All of these seem possible with this kind of scheme,
although none seem easy, and will probably seem even harder in practice. On
the "expensive" issue, I guess after we design it as efficiently as
possible, and handle serialization of intermediate data as necessary, that
becomes a user problem. Also, on this particular issue, I have no problem
(at least for now) with throwing up our hands & saying "we're not going to
do that!" I suppose the next best thing is to take the smallest of the page
widths & use that for all of them.

BTW, I am not entirely sure that Knuth's algorithm can be extrapolated for
this purpose -- I am going on intuition here -- it seems like it should be
doable.

Victor Mote


Re: How layout works

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Victor Mote wrote:
> I read somewhere that Peter Karow (I think) was trying to extrapolate TeX's
> paragraph-oriented line-breaking algorithm to a scheme that would optimize
> layout for a chapter (page-sequence to us). Until talked out of it, I have
> tentatively adopted this as my opinion of the "right" approach.
+1

> Knuth has
> laid out the theory, algorithms, and implementation in such detail on the
> line-breaking logic, that it seems (to my optimistic mind) for the other to
> be in reach as well.

Show stoppers: writing modes and side floats. Tables with automatically
determined column widths are still expensive, especially if spanning pages
with page masters with different body width (the spec doesn't talk about
this, for good reason).

> My apologies for the long post.
No need to apologize for writing up good ideas.

J.Pietschmann



Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 08.05.2003 23:10:42 Glen Mazza wrote:
> Jeremias,
> 
> Thanks for the explanation--sorry for my ignorance--I
> did not realize XSLT/TrAXInputHandler were exposed to
> the user for embedded programming.  I thought these
> two classes were for internal use only, for generating
> XSL FO when XML & XSL are provided on the
> command-line.  (For embedded coding, I've always used
> JAXP to get the XSL FO InputSource, and only then used
> the FOP Driver code.)  

...which is a Good Thing (tm).

> I now see the concerns about quickly deprecating the
> classes.  "In a perfect world", though, it would be
> nice if all users were using JAXP--this API is pretty
> elegant/succinct as-is, and it would be good for
> everyone to become skilled with it.  Direct JAXP use
> is what the Xalan team, in their FAQ, appears to be
> promoting as well.

That's why I wrote the embedding examples (Example??2??.java) a few
months ago. You see, the API still comes from pre-JAXP times which
explains some of this. Maybe we should really just change the embedding
page to JAXP usage.
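
For readers who have not used JAXP this way, the following is a minimal
sketch of the pattern: XML plus a stylesheet are transformed straight into SAX
events carrying XSL-FO. Only standard javax.xml.transform and org.xml.sax
classes are used. JaxpSketch, BlockCounter, countBlocks and the sample
documents are invented for this example; in an embedded FOP setup the
ContentHandler would presumably come from Driver (the getContentHandler()
mentioned elsewhere in this thread) instead of the counting stand-in used here.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example; only the javax.xml.transform / org.xml.sax calls are real API.
final class JaxpSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    static final String SAMPLE_XML = "<doc><para>one</para><para>two</para></doc>";

    static final String SAMPLE_XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/doc'>"
        + "<fo:root><xsl:for-each select='para'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:for-each></fo:root>"
        + "</xsl:template></xsl:stylesheet>";

    /** Stand-in for a formatter's ContentHandler: it only counts fo:block elements. */
    static final class BlockCounter extends DefaultHandler {
        int blocks;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("fo:block".equals(qName) || (FO_NS.equals(uri) && "block".equals(localName))) {
                blocks++;
            }
        }
    }

    /** Runs the stylesheet over the XML and feeds the resulting FO events to a handler. */
    static int countBlocks(String xml, String xsl) {
        BlockCounter handler = new BlockCounter();
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            // SAXResult pushes the transformation output into the handler as SAX events,
            // so no intermediate .fo file is written.
            transformer.transform(new StreamSource(new StringReader(xml)), new SAXResult(handler));
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
        return handler.blocks;
    }

    public static void main(String[] args) {
        System.out.println("fo:block elements seen: " + countBlocks(SAMPLE_XML, SAMPLE_XSL));
    }
}
```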

Does anyone disagree?



Jeremias Maerki


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 03.05.2003 21:18:19 Glen Mazza wrote:
> 
> --- "J.Pietschmann" <j3...@yahoo.de> wrote:
> > Glen Mazza wrote:
> > > I got them combined earlier--the patch (attached) to
> > > XSLTInputHandler seems to work fine in getting rid of
> > > TRAXInputHandler.java.
> > 
> > The problem is all the sample code and user programs
> > mentioning XSLTInputHandler and TRAXInputHandler out
> > there.
> 
> Usually, they're referencing just Fop or Driver
> classes so they would be hidden from these changes.

No. Just look at http://xml.apache.org/fop/embedding.html

The Fop class should not be used from a Java program. It's only used as
a starting point for the command-line. As soon as you use Driver you may
also have to use XSLT/TraxInputHandler if you don't use JAXP and
getContentHandler().

> These are internal functions--so the user can enter an
> .XML and an .XSL instead of an .FO on the command
> line.  I don't believe users accessed these classes
> directly programmatically--it would be very difficult
> and cumbersome to do so, especially for
> TRAXInputHandler.  
> 
> Still, the nature of the code (XSLTInputHandler
> delegating to TRAXInputHandler upon finding a
> TRAX-compatible XSLT processor) should not require any
> changes to code that normally references
> XSLTInputHandler.  

But once you've got code out, you need to support it. We've made the mistake
of ignoring backwards-compatibility too many times in the past and got
barked at. And they were right. As it looks, the stuff (Driver and
friends) in the apps package will eventually get deprecated. We might
even need to revert some of the changes done to sync with the
maintenance branch. Then we can concentrate on a good API in another
package.

Even now, HEAD FOP doesn't run in Cocoon without patching the
fop-cocoon-block. We need to fix that.

> > We should deprecate one (or both) for a transition period
> > before removing them. I thought I did this???
> > 
> 
> I'm recommending this for DR1.0--not for the
> maintenance release.

Sure. We're not talking about the maintenance branch here.

> By the time DR1.0 is ready that
> will more than constitute the transition period.

Not necessarily. That's when migration will start! We need to provide a
good migration path.

> XalanJ-1 (support for which was the original reason
> for having both classes in FOP) has already been
> transitioned off and discontinued from the Xalan web
> page--you can't even download it anymore.  Both Saxon
> and XalanJ-2 are TRAX-compatible, so starting with
> DR1.0 is an excellent time to get rid of the XalanJ-1
> dinosaur.

That's true. But that doesn't mean we can just throw
XSLT/TraxInputHandler out of the window.

> I'd recommend applying the patch to
> XSLTInputHandler--it seems to work well and you can
> automatically delete the TRAXInputHandler class as a
> result.  For later Avalonization of this particular
> class, it can just be applied to this one file
> instead.

I am -1 on removing TraxInputHandler for now, therefore I'm -0 on
applying your patch. Sorry. I hope you see that we're going in another
direction:
- We are discussing a totally new and intuitive API.
- We need to preserve backwards-compatibility for some time, even if it
  hurts a bit.

Jeremias Maerki


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by Glen Mazza <gl...@yahoo.com>.
--- "J.Pietschmann" <j3...@yahoo.de> wrote:
> Glen Mazza wrote:
> > I got them combined earlier--the patch (attached) to
> > XSLTInputHandler seems to work fine in getting rid of
> > TRAXInputHandler.java.
> 
> The problem is all the sample code and user programs
> mentioning XSLTInputHandler and TRAXInputHandler out
> there.

Usually, they're referencing just Fop or Driver
classes so they would be hidden from these changes.

These are internal functions--so the user can enter an
.XML and an .XSL instead of an .FO on the command
line.  I don't believe users accessed these classes
directly programmatically--it would be very difficult
and cumbersome to do so, especially for
TRAXInputHandler.  

Still, the nature of the code (XSLTInputHandler
delegating to TRAXInputHandler upon finding a
TRAX-compatible XSLT processor) should not require any
changes to code that normally references
XSLTInputHandler.  


> We should deprecate one (or both) for a transition period
> before removing them. I thought I did this???
> 

I'm recommending this for DR1.0--not for the
maintenance release.  By the time DR1.0 is ready that
will more than constitute the transition period.

XalanJ-1 (support for which was the original reason
for having both classes in FOP) has already been
transitioned off and discontinued from the Xalan web
page--you can't even download it anymore.  Both Saxon
and XalanJ-2 are TRAX-compatible, so starting with
DR1.0 is an excellent time to get rid of the XalanJ-1
dinosaur.

I'd recommend applying the patch to
XSLTInputHandler--it seems to work well and you can
automatically delete the TRAXInputHandler class as a
result.  For later Avalonization of this particular
class, it can just be applied to this one file
instead.

Glen


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Glen Mazza wrote:
> I got them combined earlier--the patch (attached) to
> XSLTInputHandler seems to work fine in getting rid of
> TRAXInputHandler.java.

The problem is all the sample code and user programs
mentioning XSLTInputHandler and TRAXInputHandler out there.
We should deprecate one (or both) for a transition period
before removing them. I thought I did this???

J.Pietschmann


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by Glen Mazza <gl...@yahoo.com>.
I got them combined earlier--the patch (attached) to
XSLTInputHandler seems to work fine in getting rid of
TRAXInputHandler.java.

(Well, the projectteam.xml/.xsl ->.pdf example works,
but the glossary.xsl/.xml example has the same
InlineStackingLayoutManager infinite
loop/OutOfMemoryError on DEV1.0 that the present
two-class code has.)

Avalonized versions or not, I'm looking forward to
having the classes merged--the presence of both is
confusing, it makes it hard to follow the code.

Glen


--- Jeremias Maerki <de...@greenmail.ch> wrote:
> I'd prefer if we called it "new API", not
> "avalonized API". Because we
> may end up with two different APIs.
> 
> On 28.04.2003 18:33:59 J.Pietschmann wrote:
> > In the avalonized API, both classes will be deprecated.
> 
> 
> Jeremias Maerki


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by Jeremias Maerki <de...@greenmail.ch>.
I'd prefer if we called it "new API", not "avalonized API". Because we
may end up with two different APIs.

On 28.04.2003 18:33:59 J.Pietschmann wrote:
> In the avalonized API, both classes will be deprecated.


Jeremias Maerki


Re: Merge TRAXInputHandler into XSLTInputHandler?

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Glen Mazza wrote:
> I was wondering if TRAXInputHandler can be merged into
> XSLTInputHandler, and the former class removed from
> use.  I'd like to try to submit a patch on this, but I
> may be missing something about why we need both
> classes.

In the avalonized API, both classes will be deprecated.

J.Pietschmann


Merge TRAXInputHandler into XSLTInputHandler?

Posted by Glen Mazza <gl...@yahoo.com>.
I was wondering if TRAXInputHandler can be merged into
XSLTInputHandler, and the former class removed from
use.  I'd like to try to submit a patch on this, but I
may be missing something about why we need both
classes.

The classes appear to have been separated in March
2001 to support users who were still working with the
non-JAXP compliant Xalan-1J transformer. 
(http://marc.theaimsgroup.com/?l=fop-dev&m=98329165432269&w=2)
But Xalan-1J is no longer supported or even available
from the Xalan homepage, and both SAXON and Xalan-2J
are JAXP-compliant.

Do we plan on continuing support for Xalan-1J for FOP
1.0?  If not, we may be OK with one class.

Glen


RE: How layout works

Posted by Victor Mote <vi...@outfitr.com>.
Glen Mazza wrote:

> > I have started (in an article originally written by Arved) a project of
> > documenting in side-by-side columns (for the old and new code) some of the
> > key control points in the code:
> > http://xml.apache.org/fop/dev/implement.html
>
> You may wish to place this "walk thru" page in the
> design section, maybe in the "About" section.  This
> keeps all the "how it works" information on the design
> page.

I'm going to defer doing any of this until I get done with the rework on the
design doc. The original idea was for design to be abstract, and dev to be
concrete. However, what seems to be working better is to combine the two
into one page, basically issues at the top, then the implementation details
at the bottom. Dev then becomes tools, procedures, etc.

> The walk-thru page can eventually evolve into a list
> of the major/primary classes that Fop uses, in order
> of access, perhaps for a sample generation of
> simple.pdf from simple.fo.

Yes, that is exactly the idea.

> [One more suggestion for the Design section,
> "Compliance" is kind of confusing as a menu title for
> the FOP detailed description section:  Maybe "FOP
> Process", "Process Flow", "How FOP Works", etc., is
> clearer.]

I agree 100%. Kind of like the guy who kicked the cactus -- it seemed like
the thing to do at the time. That has already been changed in CVS. If you
want to see the current (unpublished) status, go to:
 http://forrestbot.cocoondev.org/sites/xml-fop/design

Victor Mote


RE: How layout works

Posted by Glen Mazza <gl...@yahoo.com>.
--- Victor Mote <vi...@outfitr.com> wrote:
> J.Pietschmann wrote:
> 
> > it might be interesting to write up again how the layout core works
> > in the old and in the new code.
> 
> We don't have a place to document the maintenance branch
> layout design, but I'll create one if you think it will help.

If the primary goal is to educate the newbies--I
wouldn't emphasize it too much.  I take it as a given
from the committers that it was not the best design,
and just concentrate on learning the current trunk
code instead.

> 
> I have started (in an article originally written by Arved) a project of
> documenting in side-by-side columns (for the old and new code) some of the
> key control points in the code:
> http://xml.apache.org/fop/dev/implement.html

You may wish to place this "walk thru" page in the
design section, maybe in the "About" section.  This
keeps all the "how it works" information on the design
page.

The walk-thru page can eventually evolve into a list
of the major/primary classes that Fop uses, in order
of access, perhaps for a sample generation of
simple.pdf from simple.fo.

[One more suggestion for the Design section,
"Compliance" is kind of confusing as a menu title for
the FOP detailed description section:  Maybe "FOP
Process", "Process Flow", "How FOP Works", etc., is
clearer.]

Glen


RE: How layout works

Posted by Victor Mote <vi...@outfitr.com>.
J.Pietschmann wrote:

> it might be interesting to write up again how the layout core works
> in the old and in the new code.

I'm actually pretty deep into sorting out & restructuring the design
documentation for the new code. Thank you, thank you, thank you for the
write up here. I will weave it into the existing doc, hopefully over the
next two days. We don't have a place to document the maintenance branch
layout design, but I'll create one if you think it will help.

I have started (in an article originally written by Arved) a project of
documenting in side-by-side columns (for the old and new code) some of the
key control points in the code:
http://xml.apache.org/fop/dev/implement.html
This was started in an attempt to help developers quickly see what class
they might want to put a breakpoint in to see what is going on. As you can
see, it hasn't ventured into the Layout section yet :-). Please feel free to
add to it.

> My approach would be
> 1. get rid of all the iterators and move the functionality to the
>     layout managers, because layout managers currently store much
>     of the same info anyway.
> 2. get rid of position objects and fold their data into break possibilities
> 3. make a proper hierarchy of break possibility objects
> 4. move stacking info to the layout context and make proper snapshots.
> 5. eventually get rid of the layout managers which correspond 1-to-1
>     to FOs and move the functionality back to the FOs.
> This is however not a small task, and I'm reluctant to start, partly because
> inquiries to Keiron and Karen for rationales of the current design didn't
> result in anything, not even a statement like "It looked good".
> Also IDs, markers and out-of-line areas will need some more attention;
> the mapping from IDs to pages/areas for example should be stored in either
> the next break possibility or a layout context snapshot.
>
> Comments?

I'm not totally up to speed here yet, but here are my initial thoughts:
The first three all sound right. On #4, I may not be understanding what you
are saying, but I am not sure that we need snapshots at all. More on this
below.

On #5, I have just recently decided that I liked the idea of the layout
managers totally separate from the FO Tree (after originally not liking it).
I'd like for layout to be pluggable, and keeping the LMs separate would seem
to facilitate that. Don't like FOP's layout? Drop in your own. I really
would like for the interface between FO & LM to be merely the passing of the
page-sequence FO to a page-sequence LM (which I don't think currently
exists). (It is pretty close to that now -- for some reason the Area Tree is
created first & passed with the FO). Actually, maybe what should happen is
that control goes back to the RenderContext, which either fires up a
StructureRenderer or a page-sequence LM. Once the page-sequence FO tree is
created & "refined", the only processing that the FO classes should do is
tree traversal, based on requests from the page-sequence LM ("Send me more
data if there is any!"). I think Peter has done some work on making the FO
Tree a first-class tree. FO Tree & Area Tree are data-oriented classes, LMs
are process-oriented. For saving memory, I am intrigued with the idea of
having the Area Tree keep track only of its parent FO, and pointers into its
content, i.e. an offset & size (text not existing in Area Tree, only
pointers to text in FO tree). The FO tree is the model, the Area Tree the
view, and the LMs the controller. Sometimes when I say things like this, I
find out later on that it is already so. If so, I apologize. I will readily
admit that I am still confused by a lot of what I see when stepping through
the code.
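
As a rough illustration of the "Area Tree keeps only pointers into the FO
tree" idea, here is a small sketch. AreaRefSketch, FoText, TextArea and layout
are hypothetical names invented for this example, and the "layout" is a plain
fixed-width split standing in for real line breaking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the model/view/controller split described above.
final class AreaRefSketch {

    /** Model: an FO node that owns its character data. */
    static final class FoText {
        final String chars;
        FoText(String chars) { this.chars = chars; }
    }

    /** View: an area that stores no text, only its parent FO plus offset and size. */
    static final class TextArea {
        final FoText parent;
        final int offset;
        final int length;

        TextArea(FoText parent, int offset, int length) {
            this.parent = parent;
            this.offset = offset;
            this.length = length;
        }

        /** Resolves the text on demand, e.g. when the renderer needs it. */
        String text() {
            return parent.chars.substring(offset, offset + length);
        }
    }

    /** Controller: cuts the FO's text into areas of at most charsPerLine characters. */
    static List<TextArea> layout(FoText fo, int charsPerLine) {
        if (charsPerLine <= 0) {
            throw new IllegalArgumentException("charsPerLine must be positive");
        }
        List<TextArea> areas = new ArrayList<>();
        for (int off = 0; off < fo.chars.length(); off += charsPerLine) {
            int len = Math.min(charsPerLine, fo.chars.length() - off);
            areas.add(new TextArea(fo, off, len));
        }
        return areas;
    }

    public static void main(String[] args) {
        for (TextArea area : layout(new FoText("abcdefgh"), 3)) {
            System.out.println(area.offset + "+" + area.length + ": " + area.text());
        }
    }
}
```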

I read somewhere that Peter Karow (I think) was trying to extrapolate TeX's
paragraph-oriented line-breaking algorithm to a scheme that would optimize
layout for a chapter (page-sequence to us). Until talked out of it, I have
tentatively adopted this as my opinion of the "right" approach. Knuth has
laid out the theory, algorithms, and implementation in such detail on the
line-breaking logic, that it seems (to my optimistic mind) for the other to
be in reach as well. One possible result is that we may only need two LMs,
one for positioning text within rectangles and one for stacking rectangles
on pages. The first is basically the TeX paragraph-oriented line-breaking
scheme. The second would be the extrapolation to the page-sequence level
(which I think might do away with the need for snapshots).
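
To show what "optimizing over the whole paragraph" means compared with
filling each line greedily, here is a much simplified total-fit sketch. It is
not the Knuth-Plass algorithm itself: there is no glue stretch or shrink, no
hyphenation and no penalties, only a dynamic program that minimises the sum of
squared leftover space on every line but the last. TotalFitSketch and
breakLines are hypothetical names, words are reduced to integer widths, and
inter-word space is assumed to be one unit.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified total-fit line breaking; an illustration, not Knuth-Plass proper.
final class TotalFitSketch {

    /**
     * Returns the index of the first word of each line, choosing breaks that
     * minimise the total squared slack (the last line is free).
     */
    static List<Integer> breakLines(int[] wordWidths, int lineWidth) {
        int n = wordWidths.length;
        long[] best = new long[n + 1]; // best[i]: minimal cost of setting words i..n-1
        int[] next = new int[n + 1];   // next[i]: first word of the line after the one starting at i
        best[n] = 0;
        for (int i = n - 1; i >= 0; i--) {
            best[i] = Long.MAX_VALUE;
            int width = 0;
            for (int j = i; j < n; j++) {
                width += wordWidths[j] + (j > i ? 1 : 0); // one unit of space between words
                if (width > lineWidth) {
                    break;
                }
                long slack = lineWidth - width;
                long cost = (j == n - 1) ? 0 : slack * slack; // last line costs nothing
                if (cost + best[j + 1] < best[i]) {
                    best[i] = cost + best[j + 1];
                    next[i] = j + 1;
                }
            }
            if (best[i] == Long.MAX_VALUE) {
                throw new IllegalArgumentException("word " + i + " is wider than the line");
            }
        }
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < n; i = next[i]) {
            starts.add(i);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Widths of the words "aaa bb cc ddddd" on a line of width 6.
        System.out.println(breakLines(new int[] {3, 2, 2, 5}, 6));
    }
}
```

With these widths a greedy filler would put the first two words on line one
and leave the third word alone on line two; the dynamic program instead
accepts a shorter first line to get a better second one. Extending the same
idea from lines in a paragraph to blocks in a page-sequence is the
extrapolation discussed above.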

I think this concept ties in with the Break Possibility idea. However, our
design doc still says that we are trying to send pages to the Renderer ASAP.
I think eventually this needs to be configurable, so that people can pick a
spot on the speed vs. quality continuum, and on the size-of-document vs.
speed continuum (i.e. a big document that needs high quality will have a
higher probability of needing to be serialized along the way).

My efforts over the past month have been oriented toward getting our doc
normalized & reorganized. The result will, I think, be smaller than the
preexisting doc, but without losing any content. We can then add to it as we
resolve some of these issues. Rhett suggested using a wiki, which is fine
for the discussion phase, but we eventually want to finalize the wiki & move
it into our design doc as issues are resolved. (We have a couple of wikis now
that need to be retired.)

I am somewhat reenergized by Joerg's work here. I am interested to know how
much interest there is from potential developers who follow this list in
getting more involved with coding, if we can get the layout design
documented & resolved better. I am thinking specifically of Rhett and Glenn,
but there may be others as well. Please speak up, tell us if you are
interested, and what you need. I think we would love to turn some serious
contributors into committers. I am glad to see Arved getting active again.
If we can get Peter's work rolled into the trunk, perhaps he could jump into
layout as well (??). We might have 6.05 to 8.05 developers here ready to
help Keiron.

My apologies for the long post.

Victor Mote


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: How layout works

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Glen Mazza wrote:
> I don't know much (read: anything) about the Area Tree
> creation process.  But your step #5 sounds close to
> your description of what the maintenance branch code
> already does.  
> 
> Are you suggesting that it might be better to move
> back to the present layout functionality in the
> maintenance branch code, and just fix the
> problems/add the enhancements you mentioned to it
> directly?

No, there are still too many differences; in particular, global
state (IDs) and out-of-line-area handling have to change
considerably. Apart from this, the interfaces to the renderers
and to the font subsystem have changed quite a bit.
I was talking about an evolution path for the redesigned
code into something which is easier to grok and makes it
much easier to implement constrained layout than the
maintenance code does.

J.Pietschmann




Re: How layout works

Posted by Glen Mazza <gl...@yahoo.com>.
I don't know much (read: anything) about the Area Tree
creation process.  But your step #5 sounds close to
your description of what the maintenance branch code
already does.  

Are you suggesting that it might be better to move
back to the present layout functionality in the
maintenance branch code, and just fix the
problems/add the enhancements you mentioned to it
directly?

Glen

--- "J.Pietschmann" <j3...@yahoo.de> wrote:
> Hi all,
> it might be interesting to write up again how the
> layout core works
> in the old and in the new code.
> 

> In the old code, most of the layout functionality is
> concentrated
> in the layout() methods of the FOs. Also most of the
> necessary state
> is concentrated as fields in the FOs. 

> 5. eventually get rid of the layout managers which
> correspond 1-to-1
>     to FOs and move the functionality back to the
> FOs.

