You are viewing a plain text version of this content. The canonical link for it is here.
Posted to fop-commits@xmlgraphics.apache.org by Apache Wiki <wi...@apache.org> on 2006/09/26 21:19:43 UTC

[Xmlgraphics-fop Wiki] Update of "PageLayout/BestFitPageBreaking" by SimonPepping

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Xmlgraphics-fop Wiki" for change notification.

The following page has been changed by SimonPepping:
http://wiki.apache.org/xmlgraphics-fop/PageLayout/BestFitPageBreaking

The comment on the change is:
A note on best fit page breaking

New page:
= Best fit page breaking =

The total file line and pagebreaking algorithm by Donald Knuth is
easily perceived as a monolith. Take it or leave it. Vincent Hennebert
suggested a way to open the algorithm up for best fit page layout. One
only collects feasible page breakpoints for the next page. When all
are collected (the content between the only active node and the
current breakpoint is too large to squeeze on the page, and the active
node is deactivated), one determines the best pagebreak. Then one
resets the algorithm to the chosen page breakpoint, which is the only
active node for the layout calculation of the next page.

Why is a best fit page breaking algorithm required? The total fit
algorithm does not easily allow one to use different inline
progression dimensions for different pages. Recently some ideas to do
this in a total fit approach have been discussed, see the page
PageAndLineBreaking. But these ideas are rather complicated and far
from practical implementation.

Why do we not write a simple best fit page breaking algorithm? Because
we do not want to discard the total fit algorithm. Ideally the best
fit and total fit algorithms coexist in FOP's code base. And they have
a common design so that they share much code.

In FOP page and line breaking is done as follows. The PageSequenceLM
starts the collection of page breakpoints by invoking its method
!GetNextKnuthElements. This is repeated recursively for all descendant
LMs. Each LM returns to its parent LM the elements it and its subtree
have collected. The BlockLevelLMs return block elements. The LineLMs
collect the inline and block elements from their descendant
InlineLevelLMs and BlockLevelLMs. For sequences of inline elements,
i.e. paragraphs, the line breaker caculates the best line breaks in
Knuth's total fit algorithm. Lines and block elements are then
returned to the parent.

The collection process is finished when this method has returned to
the PageSequenceLM once or multiple times, and the PSLM is
finished. Then the page breaker kicks in and calculates the best page
breaks in Knuth's total fit algorithm.

To realize the idea of best fit page breaking the page breaker must be
able to interrupt the element collection process more often. It can
then evaluate the collected block elements and see if the best fit for
the next page break can already be determined. This requires a
communication channel between the element collection process, which
may have deeply descended the tree of LMs, and the page breaker.

This communication process can be achieved as follows. The method
!GetNextKnuthElements should not return the collected elements to the
parent LM, but hand them over to the page breaker. In the current set
up the page breaker performs a loop over the legal
breakpoints. Actually this is a loop over all elements, and the
iteration ends quickly if the element is not a legal page break. In
the new situation this loop should be interruptible and
resumable. When the page breaker receives a sequence of one or more
elements, it performs the iterations over these elements. If it does
not have enough elements to determine the best pagebreak, the
collection process of the LMs continues. If it does find the best
pagebreak, and if the ipd of the next page is different, the
collection process is reset to the chose pagebreak. If the ipd of the
next page is the same, the deselected feasible page breaks are
removed, but the already collected elements can be retained. The
selected page breakpoint becomes the only active node, the existing
element list can be evaluated by the page breaker, and then the
collection process can be resumed at the page break to which it was
reset or at the point where it left off.

In this process paragraphs are always evaluated until the paragraph
end, and the results contributed to the page breaker. When the page
break is in the middle of a paragraph and the ipd of the next page is
different, its linebreaks need to be recalculated from the page
break. When the ipd of the next page is different, this procedure does
not guarantee that it finds the best layout for the paragraph. For
that purpose it should find the best linebreaks up to the last line on
the page, not the best linebreaks for the whole paragraph. But this
procedure is rather simple, and good enough.

In principle the linebreaking process need not be changed from the
current situation. But the BlockLevelLMs in inline content expect to
be able to return their elements to a breaker, which for inline
content is probably the LineLM. The order of inline and block elements
needs to be retained. Therefore, it may be required to redesign the
inline element collection process along the same lines as the
top-level block element collection process.

Best or total fit? This can be a parameter of the page breaker. In
total fit mode the pagebreaker is the recipient of the elements as
well. But it lets the collection process continue until it has the
complete page sequence. Only then does it start the page breaking
calculations.

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-commits-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-commits-help@xmlgraphics.apache.org