Posted to fop-dev@xmlgraphics.apache.org by Arved Sandstrom <Ar...@chebucto.ns.ca> on 2001/03/20 03:37:37 UTC

RFC: Tentative Ideas for Improvements [Long]

Hi, all

Here is some stuff that represents a fair amount of FOP code review, spec 
review, etc etc. I'm hoping to get some feedback.

Background: what we currently do is we construct the FO tree, then we lay it 
out into the area tree, and then we render the area tree. The formatting 
starts at Root, but since we can consider page sequences to be essentially 
independent, the real control centre of formatting is PageSequence.

PageSequence chugs through the entire flow, manufacturing the appropriate 
Page as dictated by the LayoutMaster, and putting static content and flow 
content into that Page. When the Page is done it is added to the AreaTree 
(which is a vector of Pages, essentially).

A key feature of our formatting is that all FO's eventually run a loop on 
their children, and instruct them to lay themselves out into some appropriate 
area. The completion status for every FO is the return value of its layout() 
method, the Status. Higher FOs act on the status as appropriate - these 
include breaks, page full, etc etc. The state of an incomplete FO is 
captured using the "marker" variable.
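
A minimal sketch of the child loop, Status return value, and marker described above. It is written in present-day Java, and every name in it (LayoutStatus, SketchArea, SketchNode, SketchLeaf) is hypothetical; it is not FOP's actual code and only illustrates the control flow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Status return value.
enum LayoutStatus { OK, AREA_FULL }

// Hypothetical stand-in for an area with a fixed amount of free space.
class SketchArea {
    int remaining;
    SketchArea(int height) { remaining = height; }
}

// Hypothetical stand-in for an FO node.
class SketchNode {
    final List<SketchNode> children = new ArrayList<>();
    int marker = 0; // index of the first child that still needs layout

    LayoutStatus layout(SketchArea area) {
        for (int i = marker; i < children.size(); i++) {
            LayoutStatus status = children.get(i).layout(area);
            if (status != LayoutStatus.OK) {
                marker = i;     // capture the incomplete state
                return status;  // a higher FO decides what to do next
            }
        }
        marker = children.size();
        return LayoutStatus.OK;
    }
}

// A leaf that needs a fixed height and cannot be split.
class SketchLeaf extends SketchNode {
    final int height;
    SketchLeaf(int height) { this.height = height; }

    @Override
    LayoutStatus layout(SketchArea area) {
        if (height > area.remaining) {
            return LayoutStatus.AREA_FULL;
        }
        area.remaining -= height;
        return LayoutStatus.OK;
    }
}
```

With three leaves of height 4 and an area of height 10, the first layout() call stops at the third leaf and returns AREA_FULL; calling layout() again with a fresh area resumes from the marker instead of starting over.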

Problem: FOP is heavily oriented towards forward processing. Backtracking 
and modified layout conditions are difficult. It can be done but it's 
unpleasant. There is an example of recording "marker" state, and 
conditionally rolling back to it, in Flow - this is for balancing multiple 
columns in a page with a multi-column span area followed by a span area with 
one column area. But this is very hardwired.

If you look at the keeps problem, or some aspects of the footnote problem 
(see one of Keiron's posts for a discussion), or really any of the 
out-of-line FO's, backtracking and retries appear almost inevitable. There 
are actually 3 major processing possibilities:

(1) Laying everything out, but not paginating, and then deciding on optimum 
page cut locations. This seems good at first blush but is actually very hard 
to do;

(2) Laying everything out, just as now, and then doing a second pass to 
correct page boundaries. This also seems a good idea initially, but on 
closer inspection one realizes that it requires very radical changes.

(3) Taking care of business as soon as possible. This is my preference. I 
think it can be done relatively cleanly, I think it is conceptually just as 
elegant as any of the other solutions if done well. It is basically 
"incremental multiple-pass" processing.

So I started thinking about (3), particularly in the context of keeps. 
You'll note that I ended up putting the getMarkerSnapshot() and rollback() 
methods into FONode, which was driven more by circumstances than great 
design at the time that I did it, but turns out to be quite useful. These 
are actually very generic methods, albeit not heavily tested (see the 
Memento pattern in the GoF book for ideas on improvement).
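
As a rough illustration of the snapshot-and-rollback idea in Memento style, here is a sketch that records the markers of a subtree in pre-order and restores them later. MarkerNode, snapshot() and this rollback() signature are hypothetical names for the sketch, not the actual FONode methods, and the sketch assumes the shape of the tree does not change between the snapshot and the rollback.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical node type; only the marker state matters here.
class MarkerNode {
    int marker;
    final List<MarkerNode> children = new ArrayList<>();

    // Memento: the markers of this subtree, collected in pre-order.
    List<Integer> snapshot() {
        List<Integer> out = new ArrayList<>();
        collect(out);
        return out;
    }

    private void collect(List<Integer> out) {
        out.add(marker);
        for (MarkerNode child : children) {
            child.collect(out);
        }
    }

    // Restores the markers from a memento taken on the same, unchanged tree.
    void rollback(List<Integer> memento) {
        restore(memento.iterator());
    }

    private void restore(Iterator<Integer> it) {
        marker = it.next();
        for (MarkerNode child : children) {
            child.restore(it);
        }
    }
}
```

The memento is an opaque list to the caller, which keeps the knowledge of how state is stored inside the node class.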

Let me discuss the column-balancing case a bit more, since I think it is 
instructive. What happens in the Flow layout() is that we know that 
column-balancing might be required. _If_ it happens, even across pages, it 
will always be back to the start of a span-area, so whenever a span-area 
starts, the state of the FO tree is recorded with getMarkerSnapshot(). If 
balancing is necessary, we do 2 things: we alter the environment (in this 
case we create a new span-area/column-area geometry), and we redo the layout.

To generalize, let's consider that we have identified a start and end point 
for a transaction in the above example (a new span area is the start, and 
a single attempt at balancing is the end point), we have identified a testing 
condition (is balancing required), and we have a mechanism for selecting 
(forcing) a different possible outcome (changing the geometry). In the most 
general case, then, we have a type of "formatting transaction", which has 
the following characteristics:

(1) a start point;
(2) an end point;
(3) one or more testing conditions, so that at the end of the transaction we 
can decide whether to proceed ("commit") or not ("rollback"); and
(4) a mechanism for changing layout conditions before we restart.
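
The four characteristics can be sketched as an interface plus a small driver loop, with a toy version of the column-balancing case as the example. Everything here is an assumption made for illustration: LayoutTransaction, TransactionRunner and ColumnBalanceTx are hypothetical names, heights are counted in lines, and "balanced" is taken to mean that the first column is no taller than the total divided by the column count, rounded up.

```java
import java.util.Arrays;

// Hypothetical interface covering the four characteristics listed above.
interface LayoutTransaction {
    void begin();          // (1) start point: record the state to return to
    void attempt();        // lay out from the start point to (2) the end point
    boolean acceptable();  // (3) testing condition, checked at the end point
    void rollback();       // restore the state recorded by begin()
    boolean adjust();      // (4) change layout conditions; false if no options remain
    void commit();         // accept the result of the last attempt
}

final class TransactionRunner {
    // Returns true if some attempt was committed, false if every option failed.
    static boolean run(LayoutTransaction tx) {
        tx.begin();
        while (true) {
            tx.attempt();
            if (tx.acceptable()) {
                tx.commit();
                return true;
            }
            tx.rollback();
            if (!tx.adjust()) {
                return false;
            }
        }
    }
}

// Toy column balancing inside one span area.
final class ColumnBalanceTx implements LayoutTransaction {
    private final int totalLines;
    private int columnHeight;   // current geometry
    private int next;           // "marker": index of the next line to place
    private int snapshot;       // marker value recorded at the start point
    private boolean adjusted;
    final int[] fill;           // lines placed in each column
    int attempts;

    ColumnBalanceTx(int totalLines, int columns, int columnHeight) {
        this.totalLines = totalLines;
        this.columnHeight = columnHeight;
        this.fill = new int[columns];
    }

    private int balancedHeight() {
        return (totalLines + fill.length - 1) / fill.length; // ceiling division
    }

    public void begin() { snapshot = next; }

    public void attempt() {
        attempts++;
        for (int c = 0; c < fill.length; c++) {
            fill[c] = Math.min(columnHeight, totalLines - next);
            next += fill[c];
        }
    }

    public boolean acceptable() { return fill[0] <= balancedHeight(); }

    public void rollback() {
        next = snapshot;
        Arrays.fill(fill, 0);
    }

    public boolean adjust() {
        if (adjusted) return false;
        columnHeight = balancedHeight(); // new span-area/column-area geometry
        adjusted = true;
        return true;
    }

    public void commit() { /* nothing to record in this toy */ }
}
```

For 10 lines in 3 columns of height 8, the first attempt fills the columns as 8, 2, 0, fails the test, and is rolled back; after the geometry change the second attempt gives 4, 4, 2 and is committed.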

Every format() and layout() method we have is implicitly a transaction, with 
a start point at the beginning of the method, an end point at the end of the 
method, and an automatic commit. Our current use of Status is a different 
mechanism entirely.

OK, where are we headed with this? Let's take a block-level FO that has a 
"keep-together", and it is not affected by any neighbours. The start point 
for it is at the beginning of its layout(); we do getMarkerSnapshot() there. 
The end point is at the end of its layout() - the condition is "are all the 
areas in one context area?" If Yes, we commit, i.e. we set the status of 
affected areas to be committed, not pending. If No, we rollback. The 
mechanism for changing the environment is to impose a "break-before=column" 
or "break-before=page", depending on the context of the keep, on the FO, and 
redo from the start point. [Note: my use of the term "pending" in
connection with areas is not meant to suggest that the current use of this
term in some of our code is related. However, the ideas are not dissimilar.]
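
The keep-together case can be sketched as a two-attempt loop. This is a toy under stated assumptions, not a proposal for the real layout() code: KeepTogetherSketch is a hypothetical name, a block is just a number of lines, columns all have the same height, and the sketch gives up after one imposed break (which is what happens when the block is taller than a whole column).

```java
import java.util.ArrayList;
import java.util.List;

final class KeepTogetherSketch {
    // Places `lines` lines starting in column 0, which already holds
    // `usedInFirstColumn` lines; returns the column index of every line.
    static List<Integer> layout(int lines, int columnHeight, int usedInFirstColumn) {
        boolean breakBefore = false;
        while (true) {
            // Start point: every attempt begins from the same recorded state.
            int column = 0;
            int used = usedInFirstColumn;
            if (breakBefore && used > 0) {
                column++;       // imposed break-before=column
                used = 0;
            }
            List<Integer> pending = new ArrayList<>();
            for (int i = 0; i < lines; i++) {
                if (used == columnHeight) {
                    column++;
                    used = 0;
                }
                pending.add(column);
                used++;
            }
            // End point: are all the areas in one context area?
            boolean together = pending.isEmpty()
                    || pending.get(0).equals(pending.get(pending.size() - 1));
            if (together || breakBefore) {
                return pending; // commit, or give up once the break has been tried
            }
            breakBefore = true; // rollback: discard pending areas and retry
        }
    }
}
```

A 3-line block in a 10-line column that already holds 8 lines is first split across columns 0 and 1, then rolled back and placed entirely in column 1.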

So far so good. What about the complicated stuff? The 3rd fo:block child of 
fo:flow has a "keep-with-next.within-column=2", the 4th has a 
"keep-together.within-page=always", and the 5th fo:block child of fo:flow 
has a "keep-together.within-page=1". Let's say that the actual keep 
strengths don't matter, here, so it's more general. Point being, the 
"transaction" is really on the 3rd, 4th and 5th flow children, not on the 
flow as a whole. Do we want the Flow to account for these situations? No.

So the provisional idea I have is to pre-process the FO tree, clean out 
conditions that can't be (like an FO with "keep-with-previous" after [in 
pre-order traversal] an FO with "break-after"), AND create "pseudo-FO's" as 
required by conditions like the above. What's a pseudo-FO? For example, in 
the situation above, the 3 FO's in question get wrapped in a Transaction 
object - they are children of that Transaction. Transaction objects extend 
from FONode (this seems best, although it's not perfect). It becomes the 
responsibility of the Transaction to handle all the book-keeping that I have 
described above.
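
A sketch of the wrapping step of such a pre-processing pass, under a deliberately simple grouping rule that is an assumption of this example: adjacent siblings are joined when the earlier one has a keep-with-next or the later one has a keep-with-previous. FoSketch, TransactionSketch and KeepPreprocessor are hypothetical names, and keep strengths and the clean-out of impossible conditions are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an FO with two boolean keep flags.
class FoSketch {
    final String name;
    final boolean keepWithNext;
    final boolean keepWithPrevious;
    final List<FoSketch> children = new ArrayList<>();

    FoSketch(String name, boolean keepWithNext, boolean keepWithPrevious) {
        this.name = name;
        this.keepWithNext = keepWithNext;
        this.keepWithPrevious = keepWithPrevious;
    }
}

// The pseudo-FO: it extends the node type and owns the wrapped FOs.
class TransactionSketch extends FoSketch {
    TransactionSketch() { super("transaction", false, false); }
}

final class KeepPreprocessor {
    // Returns a new sibling list in which each run of linked FOs is
    // replaced by one TransactionSketch holding that run as children.
    static List<FoSketch> wrap(List<FoSketch> siblings) {
        List<FoSketch> out = new ArrayList<>();
        List<FoSketch> run = new ArrayList<>();
        for (FoSketch fo : siblings) {
            boolean linked = !run.isEmpty()
                    && (run.get(run.size() - 1).keepWithNext || fo.keepWithPrevious);
            if (!linked) {
                flush(run, out);
            }
            run.add(fo);
        }
        flush(run, out);
        return out;
    }

    private static void flush(List<FoSketch> run, List<FoSketch> out) {
        if (run.size() > 1) {
            TransactionSketch tx = new TransactionSketch();
            tx.children.addAll(run);
            out.add(tx);
        } else {
            out.addAll(run); // zero or one FO: nothing to wrap
        }
        run.clear();
    }
}
```

The parent then sees one child where it used to see several, so its own layout loop does not need to know that a keep spans those siblings.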

The really interesting part is, how do we design a useful Transaction class? 
This is where the Patterns book comes in really handy, and I'm still 
swotting. Loose concepts: we only have so many atomic conditions, and 
possibly each one of those becomes a class. How the Transaction stores the 
condition instances is one question, obviously. Maybe use of Strategy?
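
One possible reading of "each atomic condition becomes a class", sketched with the Strategy pattern. The names (LayoutCondition, AllInOneColumn, AtMostColumns, ConditionSet) and the two example conditions are invented for illustration, and storing the conditions in a plain list is only one of several options.

```java
import java.util.ArrayList;
import java.util.List;

// Strategy interface: one atomic testing condition.
// The argument is the column index of each pending area, in order.
interface LayoutCondition {
    boolean holds(List<Integer> columnOfEachArea);
}

// All areas ended up in the same column.
final class AllInOneColumn implements LayoutCondition {
    public boolean holds(List<Integer> cols) {
        for (Integer c : cols) {
            if (!c.equals(cols.get(0))) {
                return false;
            }
        }
        return true;
    }
}

// The areas are spread over no more than a given number of columns.
final class AtMostColumns implements LayoutCondition {
    private final int limit;
    AtMostColumns(int limit) { this.limit = limit; }

    public boolean holds(List<Integer> cols) {
        return cols.stream().distinct().count() <= limit;
    }
}

// A transaction could hold its conditions in a list and require
// all of them to hold before it commits.
final class ConditionSet {
    private final List<LayoutCondition> conditions = new ArrayList<>();

    ConditionSet add(LayoutCondition condition) {
        conditions.add(condition);
        return this;
    }

    boolean allHold(List<Integer> cols) {
        for (LayoutCondition condition : conditions) {
            if (!condition.holds(cols)) {
                return false;
            }
        }
        return true;
    }
}
```

New kinds of condition can then be added without touching the transaction class itself.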

I think that there is a good chance that we will have an improved picture on 
how to handle space-specifier resolution once this discussion is underway.
If you're keen about helping do some of the design, maybe think about keeps,
footnotes, side-floats and before-floats, space-specifier sequences in
various spots etc etc. Think about all of the possible nasty situations -
where would one designate start and end points for a "transaction"? What
is the testing condition(s)? How do you change the environment to force
a different outcome? Bear in mind, too, that we are concerned with
fo:region-body - this is where flow content is currently going.

Right now I'm doing paper-and-pencil and pseudocode. I have sort of a warm 
fuzzy about the above, but I'm not 100% sure whether it will work. My gut 
feeling though is that if it does it keeps extra logic code out of layout 
methods - I started looking at the example of 3 fo:blocks above, operating 
on the assumption that the layout() methods of each have to cooperate and 
handle situations like that, and it got very ugly very quickly. :-) That I 
think we want to avoid at all costs.

Anyhow, lots of feedback is requested. If anything is unclear please ask. 
I'm hopeful that if this idea passes review, then after some initial overhead 
it will break the log-jam that I see in our code at the moment.

Thanks for your patience.

Regards,
Arved

Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org

Re: RFC: Tentative Ideas for Improvements [Long]

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.

At 02:13 PM 3/31/01 +1000, Peter B. West wrote:
>I would like to make a few general observations on this discussion,
>which are still unencumbered by any knowledge of the code, although I am
>slowly, in spite of many interruptions, becoming more familiar with the
>spec.  I am still struggling to get a handle on all of this.

Hi Peter

You may or may not follow the XML Apache general mailing list. If not, you 
won't be aware that a charter proposal is being debated, and it will almost 
certainly mandate the following from each project (among other things):

1) set of requirements;
2) design document(s)

Now, we have both of those. And both, particularly the design docs, are 
obsolete.

I've noticed your spec study over the past months, and I consider that a 
valuable resource to this project. I am sure others do also. You might call 
it struggling, but considering the rather arcane portions of the spec that 
you're looking at, I'd say that you're developing some useful knowledge that 
I'd hate to lose. Point being, we will almost certainly be doing up new 
formal designs over the next 1-2 months, and your input and assistance are 
encouraged. Contributors don't have to stick to code, after all.

If you have some architectural/high-level design ideas, and would like to 
help us in sketching out high-level design, please feel free. Pseudocode, 
any type of notation (it doesn't have to be just UML), and insightful 
commentary such as you've provided so far - all welcome.

I'm not normally this effusive, but I'm trying to flatter you to the point 
where you write the design doc and we can then rubberstamp it... :-) Just 
kidding. Anyhow, all assistance welcome.

Regards,
Arved Sandstrom

P.S. All that aside, I did read your last commentary in detail, and although 
I don't have the time to make any major observations at the moment, I agree 
with your general comments.

Note that every area does have a trait that ought to be used to associate it 
to the FO that generated it. "generated-by", I think it's called, off the 
top of my pointy head.

In reference to my "transactions" concept and "committal"; Yes, actual 
committal might never happen, per se. It could well be more of a concept 
than a concrete act, as Karen also suggested. In terms of keeping track of 
start points, absolutely - we do that to a point now, with "marker" in 
FONode, and that's what the current getMarkerSnapshot() and rollback() 
methods in FONode are all about, also. I agree that in general we would be 
keeping track of more data of this kind.

Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia


Re: RFC: Tentative Ideas for Improvements [Long]

Posted by "Peter B. West" <pb...@powerup.com.au>.

I would like to make a few general observations on this discussion,
which are still unencumbered by any knowledge of the code, although I am
slowly, in spite of many interruptions, becoming more familiar with the
spec.  I am still struggling to get a handle on all of this.

Thanks to Karen for her comments on a previous post.  I had no idea
whether that post had made any sense to those more familiar with the
problems.

I agree with Arved in the first post of this series that "incremental
multi-pass processing" is a good idea, but I think it might well be more
compatible with the galley idea than it at first seems.  If the
processing of flows can be set up somehow as independent threads which
block when they get ahead of the current page layout thread, even as
they obtain information about the formatting context from that thread or
some subsidiary, both models may in fact be realized.

The two main "contradicting" activities are the processing of flows and
the filling of individual pages, and the nasties seem to happen because
the page model is imposed over the top of the flows.

Karen points out that backtracking does not necessarily mean that
everything has to be redone.  Conceptually, the area tree is built up
from a number of atoms whose size is determined by their image
properties, be they character glyphs or other image blocks.  As these
atoms are merged into higher and higher level blocks, additional
positioning information attaches indirectly to them through their
association with siblings and with these enclosing entities, e.g.,
relative position in some enclosing block, or absolute position in a
viewport area.  At the same time, the dimensions of these enclosing
entities are being derived and stored.  Once the basic information about
the atoms is available, it need never be derived again.  On the other
hand, the derived dimensions of a higher-level entity may be affected by
a number of factors.

As Karen points out, while the inline-progression-dimension is constant,
the dimensions of a line-area need never be recalculated either.  I
understand that the text "atom" in FOP is a run of text, dealt with as a
whole, so this may be the only meaningful, or practicable, granularity. 
This atom, however, can be split.  In any case, it seems to me that, for
this to work, the flow objects and their children need to be represented
in such a way that the derived area objects can readily be associated
with them.

When backtracking occurs, it will presumably involve backtracking in the
FO tree, in both the flow and page/region dimensions.  If the
already-derived areas are associated with the flow objects, only the
minimum amount of layout regeneration need be done.
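
A rough sketch of what associating derived areas with a flow object could look like: the block remembers the line areas it produced for a given inline-progression-dimension and only redoes line breaking when that dimension changes. CachedBlock and its members are hypothetical names, widths are measured in characters purely to keep the example small, and the greedy line breaker is a stand-in for real inline layout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical block that keeps its derived line areas between layout attempts.
class CachedBlock {
    private final String[] words;
    private int cachedIpd = -1;       // dimension the cached lines were built for
    private List<String> cachedLines;
    int lineBreakRuns = 0;            // counts how often lines were actually rebuilt

    CachedBlock(String... words) {
        this.words = words;
    }

    // Returns the line areas for the given inline-progression-dimension,
    // reusing the previous result when the dimension is unchanged.
    List<String> lines(int ipd) {
        if (ipd != cachedIpd) {
            cachedLines = breakLines(ipd);
            cachedIpd = ipd;
            lineBreakRuns++;
        }
        return cachedLines;
    }

    // Greedy line breaking; a word longer than ipd gets a line of its own.
    private List<String> breakLines(int ipd) {
        List<String> out = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : words) {
            if (line.length() > 0 && line.length() + 1 + word.length() > ipd) {
                out.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            out.add(line.toString());
        }
        return out;
    }
}
```

After a rollback that only moves a page or column break, a second call with the same dimension returns the existing lines, which can then be handed to a different parent area.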

One thing that might come out of such an approach is that Arved's
transactions need never be committed.  The whole of the area tree is
pending until the last page number citation has been resolved.  It is
then a question of keeping track of the start points of sequences of
objects that affect certain contexts - current line, current column,
current page.  I haven't had a chance to think about much of this.

Peter

Arved Sandstrom wrote:
> 
> At 10:30 PM 3/22/01 +0100, Karen Lease wrote:
> >Hi folks,
> >
> >The idea of "layout transactions" has a cool sound to it (which doesn't
> >keep it from giving Arved a warm fuzzy :-)
> >But even though it sounds cool, I think my worry is sort of like Fotis'.
> >I'd express it by saying that it feels too local, especially when any
> >kind of floating FOs are involved.
> 
> Such an impression would be my fault for being imprecise. What I am really
> trying to say is this: we are all aware that the granularity of our control
> structures maps one-to-one to our FO's; however, we can identify a number of
> situations where a "thing" (a "layout transaction" for want of a better
> word) that maps to sets of FO's provides for a better control wrapper or
> control container.
> 
> What a transaction is (i.e. what it wraps) is determined by the problem.
> 
> I am also not expecting this mechanism to solve every problem we have. I
> hope it helps out quite a bit. My gut feeling is that something like it is a
> step in the right direction.
> 
> >Arved, in one of the examples in your first post, you wrote:
> >> The end point is at the end of its layout() - the condition is "are all the
> >> areas in one context area?" If Yes, we commit, i.e. we set the status of
> >> affected areas to be committed, not pending. If No, we rollback.
> >
> >What exactly does it mean for the area to be committed? I don't think it
> >can mean that we won't touch it again. Imagine that we are in two column
> >layout and that a following area contains a before-float or footnote. If
> >we put that on the page, it shortens the main flow area, and that means
> >we need to look again at some areas which may already be committed.
> 
> My picture of how this would work is that _however_ we do this kind of
> situation, we have to know exactly what our rules and algorithms are before
> we code it up, or it's a mess. If we do it in a more traditional FOP way,
> and have all sorts of FOP classes knowing about the guts of other FOP
> classes, we still have to do this kind of backtracking and re-layout. I
> looked at some of the code we have now, and decided that yours truly does
> not want to add more logic of this sort to any of our layout methods. I
> suspect nobody else does either.
> 
> A "transaction" is not a magic bullet for these situations. What it is,
> though, is a wrapper around the situation, pulling the logic for handling
> the situation out of the individual FO's, and hopefully simplifying
> implementation a good deal.
> 
> In your example above, yes, absolutely, if the before-float went on that
> page, the main-reference-area would get shortened up. And we would revisit
> FOs that have put their areas into the affected page. In other words, all of
> this, IMO, would be part of a single transaction. Or there would be two
> transactions contained in yet another.
> 
> >It's clear that the transaction handlers have to know a lot of things
> >about the layout strategy of their children. When I was thinking about
> >space-specifiers a while ago, I had the idea of a LayoutManager (or
> >several), which implemented various strategies for fitting areas on
> >pages. I think there is a kind of bubble-up effect between lower level
> >managers and higher level managers. Way back in January (of this year!)
> >during an extended keeps discussion, Peter West wrote down some
> >interesting ideas along this line, in which he suggested that some kinds
> >of layout processes on a page were going along in parallel and vying for
> >space (this is an extreme simplification of his post).
> 
> Precisely. All due credit where credit is due. I read everything that gets
> posted on this list, and although I can't specifically recall any of it,
> necessarily, months later, I can guarantee that all of it shapes my thinking
> until the whole ball of wax comes together, and I get an idea.
> 
> Yes, absolutely, I agree that there is a bubble-up effect. In a primitive
> way we have that now with FO layout() methods, and status codes. But this
> doesn't allow for backtracking or retries, and it is tied to the existing
> FO's.
> 
> Your LayoutManager presumably shares many design concepts and goals with the
> "transaction" idea I am flogging.
> 
> >This is rather how I see the situation with float placement, keeps and
> >breaks and column-balancing.
> >Each area has a certain "weight" which it can bring to bear. For
> >example, if we lay out a block with a before-float in it, that float
> >"wants" to be placed on the same page as the anchor area. But it has
> >less weight than the anchor itself, i.e. if laying out the float causes
> >the block to be broken over the page boundary and the anchor area is
> >thrown onto the next page, the solution is no good. There is an absolute
> >requirement that the float be on the same or a later page as its anchor.
> >So if the anchor can't be squeezed onto the page with its float, the
> >float will have to be put onto the next page. In the one-column case,
> >this is still fairly manageable. It obviously gets worse in the
> >multi-column case, since other blocks besides the one containing the
> >anchor may need to be revisited. Then we could run into things like a
> >block which has keep-together=always being broken over a column boundary
> >in order to get the float to fit. So the layout strategist has to know
> >how to back up and try again. It also has to recognize impossible
> >situations, like things which want to be kept together but which won't
> >fit on any page (the famous make pages forever problem).
> 
> I think we can capture all of these with rules. As I sort of broke things
> down 2 posts ago, I think that an atomic layout situation (something that is
> amenable to being described by a "transaction", or being handled by a
> LayoutManager) must be describable according to:
> 
> (1) start point;
> (2) end point;
> (3) condition(s)/test(s); and
> (4) what is the change mechanism for influencing a different (better) layout
> outcome the next time around? Could be area-based, FO-based, or a hybrid.
> 
> I'm currently writing down actual use cases and trying to describe all of the
> basic ones with this 4-point "transaction description". I'll add the above
> also. Then I'm going to start combining them, and see how we can act on sets
> of conditions/rules (keep strengths, the "weights" you allude to above, etc.
> etc).
> 
> >A general remark about the "rollback" idea. I would think we could often
> >avoid redoing the inline layout of a block again if we need to change
> >the break decision. If the parent reference area still has the same
> >inline-progression dimension, we should just be able to shove areas
> >(LineAreas, nested BlockAreas) from one BlockArea parent to another. If
> >we get into serious layout strategies, this kind of optimization will be
> >interesting. Of course there are cases where we can't do this. One is
> >where the page designer has been so perverse as to use page-masters with
> >different column widths in the same page-sequence. We might also get
> >this with intruding side-floats. And if we want to get really
> >sophisticated, we could do inline reflow to fix things like not being
> >able to hyphenate a word which is broken over a column or a page!
> 
> I was thinking more of the general case when I suggested that we would
> re-layout. You are of course correct - there are situations where we could
> transfer areas.
> 
> Getting into sophisticated reflow...I was sort of hinting at that as being
> one of the other 2 major strategies that we probably don't want to get into
> in a big way until we get FOP 1.0 out the door, and can re-assess and take a
> breather. What do you think?
> 
> Regards,
> Arved
> 
> Fairly Senior Software Type
> e-plicity (http://www.e-plicity.com)
> Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia
> 

-- 
Peter B. West  pbwest@powerup.com.au  http://powerup.com.au/~pbwest
"Lord, to whom shall we go?"


Re: RFC: Tentative Ideas for Improvements [Long]

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.

At 10:30 PM 3/22/01 +0100, Karen Lease wrote:
>Hi folks,
>
>The idea of "layout transactions" has a cool sound to it (which doesn't
>keep it from giving Arved a warm fuzzy :-)
>But even though it sounds cool, I think my worry is sort of like Fotis'.
>I'd express it by saying that it feels too local, especially when any
>kind of floating FOs are involved.

Such an impression would be my fault for being imprecise. What I am really 
trying to say is this: we are all aware that the granularity of our control 
structures maps one-to-one to our FO's; however, we can identify a number of 
situations where a "thing" (a "layout transaction" for want of a better 
word) that maps to sets of FO's provides for a better control wrapper or 
control container.

What a transaction is (i.e. what it wraps) is determined by the problem.

I am also not expecting this mechanism to solve every problem we have. I 
hope it helps out quite a bit. My gut feeling is that something like it is a 
step in the right direction.

>Arved, in one of the examples in your first post, you wrote:
>> The end point is at the end of its layout() - the condition is "are all the
>> areas in one context area?" If Yes, we commit, i.e. we set the status of 
>> affected areas to be committed, not pending. If No, we rollback.
>
>What exactly does it mean for the area to be committed? I don't think it
>can mean that we won't touch it again. Imagine that we are in two column
>layout and that a following area contains a before-float or footnote. If
>we put that on the page, it shortens the main flow area, and that means
>we need to look again at some areas which may already be committed.

My picture of how this would work is that _however_ we do this kind of 
situation, we have to know exactly what our rules and algorithms are before 
we code it up, or it's a mess. If we do it in a more traditional FOP way, 
and have all sorts of FOP classes knowing about the guts of other FOP 
classes, we still have to do this kind of backtracking and re-layout. I 
looked at some of the code we have now, and decided that yours truly does 
not want to add more logic of this sort to any of our layout methods. I 
suspect nobody else does either.

A "transaction" is not a magic bullet for these situations. What it is, 
though, is a wrapper around the situation, pulling the logic for handling 
the situation out of the individual FO's, and hopefully simplifying 
implementation a good deal.

In your example above, yes, absolutely, if the before-float went on that 
page, the main-reference-area would get shortened up. And we would revisit 
FOs that have put their areas into the affected page. In other words, all of 
this, IMO, would be part of a single transaction. Or there would be two 
transactions contained in yet another.

>It's clear that the transaction handlers have to know a lot of things
>about the layout strategy of their children. When I was thinking about
>space-specifiers a while ago, I had the idea of a LayoutManager (or
>several), which implemented various strategies for fitting areas on
>pages. I think there is a kind of bubble-up effect between lower level
>managers and higher level managers. Way back in January (of this year!)
>during an extended keeps discussion, Peter West wrote down some
>interesting ideas along this line, in which he suggested that some kinds
>of layout processes on a page were going along in parallel and vying for
>space (this is an extreme simplification of his post).

Precisely. All due credit where credit is due. I read everything that gets 
posted on this list, and although I can't specifically recall any of it, 
necessarily, months later, I can guarantee that all of it shapes my thinking 
until the whole ball of wax comes together, and I get an idea.

Yes, absolutely, I agree that there is a bubble-up effect. In a primitive 
way we have that now with FO layout() methods, and status codes. But this 
doesn't allow for backtracking or retries, and it is tied to the existing
FO's.

Your LayoutManager presumably shares many design concepts and goals with the 
"transaction" idea I am flogging.

>This is rather how I see the situation with float placement, keeps and
>breaks and column-balancing.
>Each area has a certain "weight" which it can bring to bear. For
>example, if we lay out a block with a before-float in it, that float
>"wants" to be placed on the same page as the anchor area. But it has
>less weight than the anchor itself, i.e. if laying out the float causes
>the block to be broken over the page boundary and the anchor area is
>thrown onto the next page, the solution is no good. There is an absolute
>requirement that the float be on the same or a later page as its anchor.
>So if the anchor can't be squeezed onto the page with its float, the
>float will have to be put onto the next page. In the one-column case,
>this is still fairly manageable. It obviously gets worse in the
>multi-column case, since other blocks besides the one containing the
>anchor may need to be revisited. Then we could run into things like a
>block which has keep-together=always being broken over a column boundary
>in order to get the float to fit. So the layout strategist has to know
>how to back up and try again. It also has to recognize impossible
>situations, like things which want to be kept together but which won't
>fit on any page (the famous make pages forever problem).

I think we can capture all of these with rules. As I sort of broke things 
down 2 posts ago, I think that an atomic layout situation (something that is 
amenable to being described by a "transaction", or being handled by a 
LayoutManager) must be describable according to:

(1) start point;
(2) end point;
(3) condition(s)/test(s); and
(4) what is the change mechanism for influencing a different (better) layout 
outcome the next time around? Could be area-based, FO-based, or a hybrid.

I'm currently writing down actual use cases and trying to describe all of the
basic ones with this 4-point "transaction description". I'll add the above 
also. Then I'm going to start combining them, and see how we can act on sets 
of conditions/rules (keep strengths, the "weights" you allude to above, etc. 
etc).

>A general remark about the "rollback" idea. I would think we could often
>avoid redoing the inline layout of a block again if we need to change
>the break decision. If the parent reference area still has the same
>inline-progression dimension, we should just be able to shove areas
>(LineAreas, nested BlockAreas) from one BlockArea parent to another. If
>we get into serious layout strategies, this kind of optimization will be
>interesting. Of course there are cases where we can't do this. One is
>where the page designer has been so perverse as to use page-masters with
>different column widths in the same page-sequence. We might also get
>this with intruding side-floats. And if we want to get really
>sophisticated, we could do inline reflow to fix things like not being
>able to hyphenate a word which is broken over a column or a page!

I was thinking more of the general case when I suggested that we would 
re-layout. You are of course correct - there are situations where we could 
transfer areas.

Getting into sophisticated reflow...I was sort of hinting at that as being 
one of the other 2 major strategies that we probably don't want to get into 
in a big way until we get FOP 1.0 out the door, and can re-assess and take a 
breather. What do you think?

Regards,
Arved

Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia


Re: RFC: Tentative Ideas for Improvements [Long]

Posted by Karen Lease <kl...@club-internet.fr>.

Hi folks,

The idea of "layout transactions" has a cool sound to it (which doesn't
keep it from giving Arved a warm fuzzy :-)
But even though it sounds cool, I think my worry is sort of like Fotis'.
I'd express it by saying that it feels too local, especially when any
kind of floating FOs are involved.

Arved, in one of the examples in your first post, you wrote:
> The end point is at the end of its layout() - the condition is "are all the 
> areas in one context area?" If Yes, we commit, i.e. we set the status of 
> affected areas to be committed, not pending. If No, we rollback.

What exactly does it mean for the area to be committed? I don't think it
can mean that we won't touch it again. Imagine that we are in two column
layout and that a following area contains a before-float or footnote. If
we put that on the page, it shortens the main flow area, and that means
we need to look again at some areas which may already be committed.
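
To make the commit question concrete, here is a minimal Java sketch of the pending/committed idea. Every name in it (AreaStatus, SketchArea, LayoutTransaction, reopen) is hypothetical and made up for illustration; none of them are existing FOP classes. It only shows that "committed" probably needs a way back to "pending" when a later float or footnote shrinks the main flow area.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not FOP code.
enum AreaStatus { PENDING, COMMITTED }

class SketchArea {
    final String name;
    final int contextArea;               // index of the column or page the area landed in
    AreaStatus status = AreaStatus.PENDING;

    SketchArea(String name, int contextArea) {
        this.name = name;
        this.contextArea = contextArea;
    }
}

class LayoutTransaction {
    private final List<SketchArea> areas = new ArrayList<SketchArea>();

    void add(SketchArea area) {
        areas.add(area);
    }

    // The end-point condition from the quoted example:
    // are all the areas in one context area?
    boolean allInOneContextArea() {
        for (SketchArea a : areas) {
            if (a.contextArea != areas.get(0).contextArea) {
                return false;
            }
        }
        return true;
    }

    // Commit if the condition holds, otherwise roll back.
    // Returns true when the transaction committed.
    boolean end() {
        if (allInOneContextArea()) {
            for (SketchArea a : areas) {
                a.status = AreaStatus.COMMITTED;
            }
            return true;
        }
        areas.clear();                   // rollback: discard the pending areas
        return false;
    }

    // Assumed extra operation: a later before-float or footnote has
    // shortened the main flow area, so committed areas become pending again.
    void reopen() {
        for (SketchArea a : areas) {
            a.status = AreaStatus.PENDING;
        }
    }

    List<SketchArea> areas() {
        return areas;
    }
}
```

If something like reopen() is needed often, "committed" is closer to "good enough for now" than to "final", which is the worry above.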

It's clear that the transaction handlers have to know a lot of things
about the layout strategy of their children. When I was thinking about
space-specifiers a while ago, I had the idea of a LayoutManager (or
several), which implemented various strategies for fitting areas on
pages. I think there is a kind of bubble-up effect between lower level
managers and higher level managers. Way back in January (of this year!)
during an extended keeps discussion, Peter West wrote down some
interesting ideas along this line, in which he suggested that some kinds
of layout processes on a page were going along in parallel and vying for
space (this is an extreme simplification of his post).

This is rather how I see the situation with float placement, keeps and
breaks and column-balancing.
Each area has a certain "weight" which it can bring to bear. For
example, if we lay out a block with a before-float in it, that float
"wants" to be placed on the same page as the anchor area. But it has
less weight than the anchor itself, i.e. if laying out the float causes
the block to be broken over the page boundary and the anchor area is
thrown onto the next page, the solution is no good. There is an absolute
requirement that the float be on the same or a later page as its anchor.
So if the anchor can't be squeezed onto the page with its float, the
float will have to be put onto the next page. In the one-column case,
this is still fairly manageable. It obviously gets worse in the
multi-column case, since other blocks besides the one containing the
anchor may need to be revisited. Then we could run into things like a
block which has keep-together=always being broken over a column boundary
in order to get the float to fit. So the layout strategist has to know
how to back up and try again. It also has to recognize impossible
situations, like things which want to be kept together but which won't
fit on any page (the famous make-pages-forever problem).
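
The one-column rule for a before-float can be sketched roughly as follows. This is only an illustration under simplifying assumptions: all heights are plain integers in one unit, the anchor block on its own is assumed to fit into the space remaining on the current page, and the names (FloatDecision, BeforeFloatPlacer) are hypothetical, not FOP classes. Keeps and the multi-column case are left out entirely.

```java
// Hypothetical sketch; these names are not FOP classes.
enum FloatDecision { SAME_PAGE, NEXT_PAGE, IMPOSSIBLE }

class BeforeFloatPlacer {
    // pageHeight:   height available on an empty page
    // remaining:    height still free on the current page
    // anchorHeight: height of the block containing the anchor
    // floatHeight:  height of the before-float
    // Assumes anchorHeight <= remaining, i.e. the anchor fits by itself.
    static FloatDecision place(int pageHeight, int remaining,
                               int anchorHeight, int floatHeight) {
        // A float taller than an empty page can never be placed:
        // this is the "make pages forever" trap.
        if (floatHeight > pageHeight) {
            return FloatDecision.IMPOSSIBLE;
        }
        // The anchor has more weight than the float, so the float only
        // shares the page if the anchor still fits next to it.
        if (anchorHeight + floatHeight <= remaining) {
            return FloatDecision.SAME_PAGE;
        }
        // Otherwise the anchor stays put and the float moves to a later page.
        return FloatDecision.NEXT_PAGE;
    }
}
```

For example, BeforeFloatPlacer.place(1000, 400, 300, 200) gives NEXT_PAGE: the anchor keeps its place and the float is deferred rather than pushing the anchor over the page boundary.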

A general remark about the "rollback" idea. I would think we could often
avoid redoing the inline layout of a block again if we need to change
the break decision. If the parent reference area still has the same
inline-progression dimension, we should just be able to shove areas
(LineAreas, nested BlockAreas) from one BlockArea parent to another. If
we get into serious layout strategies, this kind of optimization will be
interesting. Of course there are cases where we can't do this. One is
where the page designer has been so perverse as to use page-masters with
different column widths in the same page-sequence. We might also get
this with intruding side-floats. And if we want to get really
sophisticated, we could do inline reflow to fix things like not being
able to hyphenate a word which is broken over a column or a page!
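
A rough Java sketch of the "shove areas from one parent to another" optimization, assuming LineAreas can be treated as opaque objects (plain strings stand in for them here). SketchBlockArea and AreaTransfer are hypothetical names, not FOP classes, and the width check is the only condition modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a BlockArea; not a FOP class.
class SketchBlockArea {
    final int ipd;                                       // inline-progression dimension
    final List<String> lines = new ArrayList<String>();  // stand-ins for LineAreas

    SketchBlockArea(int ipd) {
        this.ipd = ipd;
    }
}

class AreaTransfer {
    // Moves the last `count` lines of `from` to the front of `to`.
    // Returns false and changes nothing when the two areas differ in
    // inline-progression dimension, in which case the lines would have
    // to be laid out again instead of being moved.
    static boolean moveTrailingLines(SketchBlockArea from, SketchBlockArea to, int count) {
        if (from == to || from.ipd != to.ipd) {
            return false;
        }
        if (count < 0 || count > from.lines.size()) {
            return false;
        }
        int start = from.lines.size() - count;
        List<String> tail = from.lines.subList(start, from.lines.size());
        to.lines.addAll(0, tail);   // copy the trailing lines to the new parent
        tail.clear();               // clearing the view removes them from `from`
        return true;
    }
}
```

When moveTrailingLines returns false, the caller falls back to a full re-layout of the block.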

Best regards,
Karen Lease

Re: RFC: Tentative Ideas for Improvements [Long]

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.

At 10:14 AM 3/20/01 +0100, Fotis Jannidis wrote:
>
>Don't we need a second pass for some information anyway or at 
>least some book-keeping for it (e.g. fo:retrieve-marker)? But is 
>the idea to preprocess the FO tree (in portions of page-sequences) 
>heading in that direction anyway? 

Yes, I don't think an explicit second pass is ruled out by the discussion 
here. I don't think fo:marker and fo:retrieve-marker cause us major problems 
at this stage, not for this kind of layout problem, because the content goes 
into fo:static-content, not fo:flow.

>How do you define the start and end point of areas which should 
>come under one transaction handler? [Rereading your letter, I saw 
>that you asked the question yourself] I am thinking of the balancing 
>of footnote text over two pages where the number of affected blocks 
>changes with different layouts, that is in the second attempt more 
>blocks could be affected? If you have footnotes on every page - as 
>is usual in the texts I have to handle - then you would have to 
>handle overlapping transaction subtrees, wouldn't you?

Nested transactions are going to happen; we need to decide if overlaps make 
sense and how to handle them - do they get combined?

My gut feeling is that the areas covered by the transaction handler are 
those areas generated by the FO's that are children of the transaction. To 
use your example concerning footnotes, I don't think that more blocks are 
affected, per se, not in the sense we both mean.

In my 2 examples, in one the transaction changed FO conditions (imposed a 
"break-before") and in the other the transaction changed the layout 
environment (shortened the span area for column balancing). If we choose to 
allow footnote body placement into following pages, I think we cover that by 
changing the layout environment - i.e. adjusting the footnote-reference-area 
sizes. So the other fo:block's on that subsequent page are _indirectly_ 
affected - they have less main-reference-area to lay out into - but they are 
not directly included in the transaction.
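
As a rough illustration of "changing the layout environment", here is a Java sketch in which growing the footnote-reference-area simply leaves less main-reference-area for whatever is laid out afterwards. SketchPageGeometry and its methods are hypothetical names, not FOP classes, and heights are plain integers in a single unit.

```java
// Hypothetical sketch; not a FOP class.
class SketchPageGeometry {
    private final int bodyHeight;      // total height of the region body
    private int footnoteHeight = 0;    // height reserved for the footnote-reference-area

    SketchPageGeometry(int bodyHeight) {
        this.bodyHeight = bodyHeight;
    }

    // Called by the footnote transaction: reserve `extra` more height for
    // footnote bodies. Returns false and changes nothing if it does not fit.
    boolean growFootnoteArea(int extra) {
        if (extra < 0 || footnoteHeight + extra > bodyHeight) {
            return false;
        }
        footnoteHeight += extra;
        return true;
    }

    // What the other blocks on the page see: they are not part of the
    // transaction, they just have less room to lay out into.
    int mainReferenceAreaHeight() {
        return bodyHeight - footnoteHeight;
    }
}
```

The blocks on the page never talk to the transaction; they only ask the geometry how much main-reference-area is left.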

I think it's OK, even desirable, for the transaction (handlers) to possess 
knowledge of the layout strategy of their FO children. E.g. a transaction 
handler for a complicated footnote situation knows what our strategy is for 
placing footnote body content, and this is how it knows to influence the 
layout geometry to cause an optimum result. A concrete implementation of a 
Transaction _should_ know about stuff like this - it is collecting that 
knowledge so as to keep the code of its children as clean as possible.

We need to do up some design notes, I think, that detail our layout 
strategies for various situations. I can document the column-balancing one, 
and also keeps. I think it is best to have this fixed on paper before we code.
In the case of footnotes, we need to anticipate all possible situations
and write down the solutions, first, I believe, otherwise we're in
trouble. :-)

Regards,
Arved
Fairly Senior Software Type
e-plicity (http://www.e-plicity.com)
Wireless * B2B * J2EE * XML --- Halifax, Nova Scotia

Re: RFC: Tentative Ideas for Improvements [Long]

Posted by Fotis Jannidis <fo...@lrz.uni-muenchen.de>.

Arved, 

very interesting ideas. Just a couple of questions: 

Don't we need a second pass for some information anyway or at 
least some book-keeping for it (e.g. fo:retrieve-marker)? But is 
the idea to preprocess the FO tree (in portions of page-sequences) 
heading in that direction anyway? 

How do you define the start and end point of areas which should 
come under one transaction handler? [Rereading your letter, I saw 
that you asked the question yourself] I am thinking of the balancing 
of footnote text over two pages where the number of affected blocks 
changes with different layouts, that is in the second attempt more 
blocks could be affected? If you have footnotes on every page - as 
is usual in the texts I have to handle - then you would have to 
handle overlapping transaction subtrees, wouldn't you?

Fotis


