Posted to fop-commits@xmlgraphics.apache.org by Apache Wiki <wi...@apache.org> on 2006/08/04 12:46:02 UTC

[Xmlgraphics-fop Wiki] Trivial Update of "GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats" by VincentHennebert

The following page has been changed by VincentHennebert:
http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats

The comment on the change is:
Splitting: 2- Implementing Before-floats

New page:
#pragma section-numbers on

'''Contents'''
[[TableOfContents]]


== Characteristics of the fo:float element ==
This section contains a summary of the part of the spec dealing with floats.

||'''Number of generated areas'''||'''Area class'''||'''Notes'''||
||<|2>0 or 1 ||<|2(>xsl-anchor||inline area of dimension 0 if possible, block area otherwise||
||only if the value of the "float" property is not "none"||
||<|6> 1 or more of||<|4>xsl-before-float||must be a descendant of a flow object assigned to a region-body||
||may not be a descendant of an absolutely-positioned block-container||
||must appear on the same or a following page||
||may be broken across several pages only if it cannot fit on a page alone (without any other float, footnote, or normal content)||
||xsl-side-float||generates reference areas||
||xsl-normal|| ||

 * Validity checks: an fo:float may not have an fo:float, fo:footnote or fo:marker as a descendant. There are several objects which have such constraints (fo:title, fo:footnote...) but AFAICT the checks for those constraints are not implemented. I'll leave it as is for now, as it is not critical. A general solution will have to be found when implementing such checks.

== Factorizing out the Handling of Footnotes and Floats ==
I see only two differences between before-floats and footnotes:
 * footnotes must appear on the same page as their citation, unless there really is no possible pagination which achieves that. Figures may appear on later pages.
 * footnotes may be split so that a part is placed on the following page. Figures may not be split.

Apart from those two differences, the handling is the same. So layoutmgr.!PageBreakingAlgorithm could be adapted to handle not a single list of footnotes but two (or more) lists of floats; the float machinery could be extracted from !PageBreakingAlgorithm and put in a dedicated parameterized class. In fact, the two parameters could simply be the penalties for deferring and splitting:
 ||'''Kind of float'''||'''Defer penalty'''||'''Split penalty'''||
 ||footnote||almost infinite||very high||
 ||before-float||high||infinite||
(Actually a before-float may be split, but only in the degenerate case where it does not fit alone on a whole page.)
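
The two-penalty parameterization in the table above could be sketched roughly as follows. This is only an illustration: the class {{{OutOfLineKind}}}, the {{{extraCost}}} method and all numeric values are assumptions made for the example, not code from the patch, and the degenerate case of a float that does not fit on a page alone is not covered.

```java
/** Hypothetical sketch: one kind of out-of-line object, described by two penalties. */
final class OutOfLineKind {
    /** Stand-in for an "infinite" penalty; the value is an assumption. */
    static final int INFINITE = 1_000_000;

    // Illustrative values only; the page gives no actual numbers.
    static final OutOfLineKind FOOTNOTE =
            new OutOfLineKind("footnote", INFINITE - 1, 10_000);
    static final OutOfLineKind BEFORE_FLOAT =
            new OutOfLineKind("before-float", 5_000, INFINITE);

    final String name;
    final int deferPenalty;
    final int splitPenalty;

    private OutOfLineKind(String name, int deferPenalty, int splitPenalty) {
        this.name = name;
        this.deferPenalty = deferPenalty;
        this.splitPenalty = splitPenalty;
    }

    /** Extra cost of a breakpoint that defers 'deferred' objects and, if 'split', splits one. */
    long extraCost(int deferred, boolean split) {
        return (long) deferred * deferPenalty + (split ? splitPenalty : 0);
    }
}
```

With such a class, the breaking algorithm would only need to consult the two penalties, whatever the kind of out-of-line object it is dealing with.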

Another possibility: a single parameter, the defer penalty, plus an overridden getFloatSplit method, which would contain the code of the current getFootnoteSplit method for footnotes and just return 0 for before-floats.
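
A minimal sketch of this second possibility, under stated assumptions: the class names, the penalty values and the signature of {{{getFloatSplit}}} are hypothetical, and the footnote variant below is a simplified stand-in, not the actual code of getFootnoteSplit.

```java
/** Hypothetical base class: one parameter (the defer penalty) plus an overridable split method. */
abstract class OutOfLineHandler {
    private final int deferPenalty;

    OutOfLineHandler(int deferPenalty) {
        this.deferPenalty = deferPenalty;
    }

    int getDeferPenalty() {
        return deferPenalty;
    }

    /**
     * Returns the length of the part of the pending object that may be placed in
     * 'available', or 0 if the object must not (or cannot) be split.
     * 'cumulatedLengths' holds the lengths up to each legal break inside the object.
     */
    abstract int getFloatSplit(int[] cumulatedLengths, int available);
}

/** Footnotes: pick the largest piece that fits (simplified stand-in). */
final class FootnoteHandler extends OutOfLineHandler {
    FootnoteHandler() {
        super(900_000); // "almost infinite"; the value is an assumption
    }

    @Override
    int getFloatSplit(int[] cumulatedLengths, int available) {
        int best = 0;
        for (int length : cumulatedLengths) {
            if (length <= available) {
                best = Math.max(best, length);
            }
        }
        return best;
    }
}

/** Before-floats: never split. */
final class BeforeFloatHandler extends OutOfLineHandler {
    BeforeFloatHandler() {
        super(5_000); // "high"; the value is an assumption
    }

    @Override
    int getFloatSplit(int[] cumulatedLengths, int available) {
        return 0;
    }
}
```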

=== Changes on the LayoutManager Architecture ===
When the "float" property is "none", the float must be handled as a normal block; no anchor area is generated. To handle this case I've chosen to directly create a !FloatBodyLayoutManager which will render the float in the flow of elements. Otherwise I mimic the behaviour of footnotes: a !FloatLayoutManager is created which will insert an anchor in the list of Knuth elements; the corresponding float blocks will be handled by !FloatBodyLayoutManager. This is done in !LayoutManagerMapping.!FloatLayoutManagerMaker, where the value of the "float" property for the corresponding Float node is consulted before creating the !LayoutManager.
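
The choice made by the maker can be pictured with the following sketch. All types here are simplified stand-ins written for this example (they are not the FOP classes, and the real maker's signature is not reproduced); only the decision on the value of the "float" property reflects the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the values of the "float" property (a subset, for illustration). */
enum FloatValue { NONE, BEFORE, START, END }

/** Stand-in for the FO tree node of an fo:float. */
final class FloatNode {
    final FloatValue floatValue;

    FloatNode(FloatValue floatValue) {
        this.floatValue = floatValue;
    }
}

/** Stand-in for the layout manager interface. */
interface LayoutManagerStub { }

/** Lays out the content of the float itself. */
final class FloatBodyLMStub implements LayoutManagerStub {
    final FloatNode node;

    FloatBodyLMStub(FloatNode node) {
        this.node = node;
    }
}

/** Inserts an anchor in the element list; the body is handled elsewhere. */
final class FloatLMStub implements LayoutManagerStub {
    final FloatNode node;

    FloatLMStub(FloatNode node) {
        this.node = node;
    }
}

final class FloatMakerSketch {
    /** Adds the layout manager matching the node's "float" value to 'lms'. */
    static void make(FloatNode node, List<LayoutManagerStub> lms) {
        if (node.floatValue == FloatValue.NONE) {
            // float="none": render the content in the normal flow, no anchor area
            lms.add(new FloatBodyLMStub(node));
        } else {
            // otherwise mimic footnotes: an anchor now, the body out of line
            lms.add(new FloatLMStub(node));
        }
    }
}
```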

There are probably things to factor out between the two !LayoutManagers; the {{{addAreas}}} method, for example, is the same. It may make sense to create an abstract !OutOfLineLayoutManager super-class. However, the {{{addAreas}}} method never seems to be called, so it could perhaps be removed, in which case a common super-class would become useless. I still have to find this out; it is on my TODO list.

=== A Special Class for out-of-line Objects ===
First, there are many classes in the layoutmgr package which are related to the Knuth breaking algorithm. As the layoutmgr package already contains a lot of classes, it may make sense to create a new subpackage for the breaking algorithm. That's what I did in the patch, and if this is agreed I'll move the other classes into this subpackage in my next patch.

The !OutOfLineRecord class is meant to contain all the logic related to the handling of out-of-line objects:
 * storing progress information during the breaking process: how many out-of-line objects have already been encountered, how many have already been placed, whether the last placed object was split, etc. The corresponding variables play exactly the same role as the totalWidth, totalStretch, totalShrink variables.
 * methods to manipulate out-of-line objects: register newly encountered ones, find a place to split, etc.

The progress information is stored in a nested class of !OutOfLineRecord; it is used for two things:
 * to record the current situation during the breaking, when a legal breakpoint is being considered;
 * when an active node is created, to record information about the out-of-line objects inserted up to the corresponding (feasible) breakpoint.

Why a nested class?
 * as the progress information is also used by active nodes, it is better to group it in one class rather than having several independent fields. Hence a class.
 * it is only one part of the information stored in an !OutOfLineRecord instance. The rest is the list of Knuth sequences corresponding to out-of-line objects, the list of cumulated lengths, the size of the separator, and so on. Hence a nested class, part 1.
 * it is accessed very often by methods of !OutOfLineRecord. Nesting is a convenient way to give those methods access to the fields while keeping them private from other classes. Hence a nested class, part 2.
 * as already said, it is also used by active nodes, and not only by an !OutOfLineRecord instance. Hence a static class.
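
The structure argued for above could look roughly like this. It is a sketch only: the names {{{OutOfLineRecordSketch}}} and {{{ProgressInfo}}}, the fields and the methods are assumptions chosen for the example, and the real class holds more data (Knuth sequences, separator size, etc.).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much reduced version of a record for out-of-line objects. */
final class OutOfLineRecordSketch {

    /**
     * Progress information. Static, so that an active node can keep a copy
     * without holding a reference to the enclosing record.
     */
    static final class ProgressInfo {
        private int encountered;   // objects met so far
        private int placed;        // objects already placed
        private boolean lastSplit; // was the last placed object split?

        ProgressInfo copy() {
            ProgressInfo c = new ProgressInfo();
            c.encountered = encountered;
            c.placed = placed;
            c.lastSplit = lastSplit;
            return c;
        }

        int getEncountered() { return encountered; }
        int getPlaced() { return placed; }
        boolean isLastSplit() { return lastSplit; }
    }

    /** cumulatedLengths.get(i) is the total length of objects 0..i. */
    private final List<Integer> cumulatedLengths = new ArrayList<>();
    private final ProgressInfo current = new ProgressInfo();

    /** Registers a newly encountered object of the given length. */
    void add(int length) {
        int previous = cumulatedLengths.isEmpty()
                ? 0 : cumulatedLengths.get(cumulatedLengths.size() - 1);
        cumulatedLengths.add(previous + length);
        current.encountered++; // private field, reachable from the enclosing class
    }

    /** Marks every encountered object as placed, unsplit. */
    void placeAll() {
        current.placed = current.encountered;
        current.lastSplit = false;
    }

    /** Total length of the objects encountered but not yet placed. */
    int pendingLength() {
        int total = cumulatedLengths.isEmpty()
                ? 0 : cumulatedLengths.get(cumulatedLengths.size() - 1);
        int done = current.placed == 0 ? 0 : cumulatedLengths.get(current.placed - 1);
        return total - done;
    }

    /** Copy of the current progress, e.g. to be stored in a new active node. */
    ProgressInfo snapshot() {
        return current.copy();
    }
}
```

The enclosing class reads and writes the private fields directly, while code outside it (an active node, say) only sees the getters of the snapshot it was given.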

=== Other Changes ===
They mostly consist of copying the footnote-related code wherever footnotes are referred to and adapting it to floats. Examples: adding anchors for before-floats in !KnuthBlockBox, adding a !FloatLayoutManagerMaker in !LayoutManagerMapping, handling the addition of a before-float area when necessary, etc.

== Algorithm for Placing Before-Floats ==
In FOP, out-of-line objects are handled by an extension of the Knuth breaking algorithm. The handling of before-floats is a bit simpler because, unlike footnotes, they can't be split across several pages (except in the degenerate case where a float does not fit on one page alone).

Ideally, a footnote should be entirely placed on the same page as its citation. When this is not possible, it may be split, but as few times as possible. See the following figure to understand the issue (the line with the small red sign contains the footnote citation):

http://atvaark.dyndns.org/~vincent/footnotes.png

In the first case, the footnote is split across two pages and that's the best we can do. In the second case, there are pieces of the footnote up to 3 pages later; this would disturb the reader, who would have too many pages to turn to read the footnote.

To avoid that, the algorithm prevents a footnote from being split if there is a legal breakpoint between the currently considered active node and the currently considered breakpoint, unless this is a new footnote (i.e., not already split). For example, in the preceding figure, every line corresponds to a legal breakpoint. When the line containing the footnote citation is considered for breaking the page, the new footnote may be split. When the following line is considered, there are already many legal breakpoints between the breakpoint of the previous page and that one, so the footnote is not allowed to be split. So the algorithm tries to put the entire footnote on the page, which does not work as it is too big. Thus the breakpoint is discarded (this is not a ''feasible'' breakpoint), and the same goes for the following lines.
For the first page, the best breakpoint then corresponds to the line with the footnote citation, which allows as much of the footnote as possible to be put on this page.

On the second page, no break will be permitted if it splits the footnote, for the same reason as before. Thus the best breakpoint will be the one which puts as many normal lines as possible on the page, plus the entire remaining piece of footnote.
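
The rule described above can be condensed into a small predicate. This is a hedged sketch, not the code of !PageBreakingAlgorithm: the class and method names are hypothetical, the element list is reduced to an array of flags, and "new footnote" is taken to mean a footnote introduced by the content just before the candidate breakpoint, which is how the example reads.

```java
/** Hypothetical sketch of the "may the footnote be split here?" rule. */
final class FootnoteSplitRule {

    /**
     * @param newFootnote     true if the pending footnote has just been introduced
     * @param activeNodeIndex index of the breakpoint ending the previous page
     * @param candidateIndex  index of the breakpoint being considered
     * @param legalBreak      legalBreak[i] is true if element i is a legal breakpoint
     * @return true if splitting the footnote at the candidate is allowed
     */
    static boolean splitAllowed(boolean newFootnote, int activeNodeIndex,
                                int candidateIndex, boolean[] legalBreak) {
        if (newFootnote) {
            return true; // a new footnote may always be split
        }
        // Otherwise a split is allowed only if no legal breakpoint lies
        // strictly between the active node and the candidate.
        for (int i = activeNodeIndex + 1; i < candidateIndex; i++) {
            if (legalBreak[i]) {
                return false;
            }
        }
        return true;
    }
}
```

In the situation of the figure, where every line is a legal breakpoint, the predicate accepts a split at the line carrying the citation and rejects it at the later lines of the same page, so those breakpoints are feasible only if the whole footnote fits.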

As before-floats may not be split, their handling is simpler than for footnotes. Actually we may use the same algorithm, but this will force the float to be on the same page as its citation, which may give underfull pages as in the following figure:

http://atvaark.dyndns.org/~vincent/floats-underfull.png

It would be better to put the citation on the first page together with some other lines and defer the float to the second page.

In fact, just playing with increased demerits for breakpoints with deferred floats is sufficient to have a reasonable amount of floats on the same page as their citations, while preventing underfull pages from being created.
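
To illustrate the trade-off, here is a minimal sketch in which each deferred float adds a fixed amount to the demerits of a breakpoint. The class name, the constant and the figures in the usage note are assumptions made for the example; the page does not say how the extra demerits are actually computed.

```java
/** Hypothetical sketch: demerits of a breakpoint, increased for deferred floats. */
final class DeferredFloatDemerits {
    /** Extra demerits per deferred float; the value is an assumption. */
    static final double DEFER_DEMERITS = 10_000;

    /**
     * @param baseDemerits   demerits from the page fill (badness, penalties)
     * @param deferredFloats number of floats cited so far but not yet placed
     */
    static double demerits(double baseDemerits, int deferredFloats) {
        return baseDemerits + deferredFloats * DEFER_DEMERITS;
    }
}
```

Suppose a badly underfull page that keeps the float with its citation has base demerits of 250000, while a well-filled page that defers the float has base demerits of 100. With the constant above, the second candidate totals 10100 and wins. If a well-filled page can also hold the float (base demerits 100, nothing deferred), it totals 100 and is preferred, so floats stay with their citations whenever that does not cost too much.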
