Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@greenmail.ch> on 2005/05/10 18:38:49 UTC

[VOTE] Merge Knuth branch back into HEAD

I'm not where I would like to be yet (with table layout). Overall,
there are still a number of problems to be solved. They are (a potentially
incomplete list):

- Table layout including header, footer, spans and borders (*)
- Markers
- before-floats and footnotes
- keeps and breaks on tables
- strength values for keeps
- the other known table-related problems as documented on the Wiki
- change of available IPD and BPD between pages
- last-page
- column-spanning and column balancing

(*) ATM I've got the basic algorithm but I'm stuck with the many details
that arise from the collapsing border model. I'm going to back off from
this for now and instead I'm going to try and at least make the separate
border model work. This model doesn't have these nasty interactions
between cells that keep my head spinning. Painting this stuff on paper
is hard enough, implementing it is even harder.

Still, we're at a point where we should finally say yes or no to further
pursuing the new page breaking approach. Merging the branch back into
HEAD means a step back for a few features and on the other side a step
forward especially for keeps. I got the impression that the team is
pretty much committed to continue on this path and this vote should
confirm that.

My vote:
At this point I'm only able to give a +0.95 where the missing 0.05 is
due to the fact that the Knuth approach has given me headache after
headache. There are pros and cons to the whole approach. I still cannot
fully exclude the possibility that we're going to hit a dead end.
And I'm still not comfortable with the complexity in certain areas,
although you could probably say that it would be similarly complex with
the old approach. Anyway, I've gotten used to thinking in terms of boxes,
glue and penalties. Were it not for tables, my vote would have been
clearer.
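
(For anyone following along who hasn't lived in the branch: here is a
minimal, purely illustrative sketch of the three Knuth element types the
algorithm juggles. These are not our actual layout classes, just the idea.)

    // Illustrative only -- the three element types of the Knuth model.
    // A sequence of these is fed to the breaking algorithm, which picks
    // the set of break points with the lowest total demerits.
    abstract class KnuthElement {
        final int width;    // space taken up in the breaking direction
        KnuthElement(int width) { this.width = width; }
    }

    class KnuthBox extends KnuthElement {       // unbreakable content
        KnuthBox(int width) { super(width); }
    }

    class KnuthGlue extends KnuthElement {      // stretchable/shrinkable space
        final int stretch, shrink;
        KnuthGlue(int width, int stretch, int shrink) {
            super(width);
            this.stretch = stretch;
            this.shrink = shrink;
        }
    }

    class KnuthPenalty extends KnuthElement {   // a potential break point
        final int penalty;  // cost of breaking here; an infinite value forbids it
        KnuthPenalty(int width, int penalty) {
            super(width);
            this.penalty = penalty;
        }
    }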

Jeremias Maerki


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
Looks like we're ok for the merge. Shall I simply do it or does anyone
want to have a chance to commit his work before that? I'm not doing it
before I've fixed my current hardware problems anyway.

Jeremias Maerki


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by "Peter B. West" <li...@pbw.id.au>.
Jeremias Maerki wrote:
> 
> Still, we're at a point where we should finally say yes or no to further
> pursuing the new page breaking approach. Merging the branch back into
> HEAD means a step back for a few features and on the other side a step
> forward especially for keeps. I got the impression that the team is
> pretty much committed to continue on this path and this vote should
> confirm that.

The team has made remarkable progress in this. My congratulations.
From the outside, I share the reservations expressed by Jeremias and
Simon. It will be an extremely impressive achievement if they are all
resolved.

Peter
-- 
Peter B. West <http://cv.pbw.id.au/>
Folio <http://defoe.sourceforge.net/folio/> <http://folio.bkbits.net/>

Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 13.05.2005 18:01:44 Andreas L. Delmelle wrote:
<snip/>
> > if you would like to take a stab at the collapsed border
> > resolution, then please do. I'll leave it aside for the moment and will
> > concentrate on implementing or fixing the rest of the important features
> > for table layout (BPD/height props, breaks, keeps, etc.).
> 
> Will certainly do so. Up to now, most of my time has still been spent catching
> up with you guys... Hopefully there will be no more unforeseen circumstances
> that keep me away for a few months, so I can finally get some really
> constructive work done on those 'ideas' of mine...

Wonderful! If I can help in any way, just yell. And don't forget to
remind me if I don't write that nasty example I promised. :-)

> <snip />
> > I hope I wasn't disrespectful by snipping out and not replying to parts
> > of your post.
> 
> Well, I wouldn't worry too much about that. I'm rather thick-skinned, if you
> know what I mean...
> 
> And if my initial reply to the vote came across as disrespectful to
> you --since you obviously have invested a great deal of your time into that
> algorithm, and I made it seem like it wasn't worth much-- my apologies of
> course!

There was absolutely no problem there.

Jeremias Maerki


RE: [VOTE] Merge Knuth branch back into HEAD

Posted by "Andreas L. Delmelle" <a_...@pandora.be>.
> -----Original Message-----
> From: Jeremias Maerki [mailto:dev.jeremias@greenmail.ch]

Hi Jeremias,

<snip />
> > [Me:]
> > ... I guess you could also see it as active (collapsing) vs. passive
> > (separated) border segments. In the collapsing model, the borders are
> > 'alive', which seems to be exactly the part that is causing the
> > headaches...
>
> Sort of, although I don't like the analogy with active and passive
> because it introduces more terms in an already complicated environment.
>

I understand perfectly, but this was meant more as an analogy, to illustrate
the inherent difficulty when dealing with collapsing borders... I have
absolutely no intention of introducing/using these terms in the
algorithm/the code (if that's what you're worried about).

<snip />
> if you would like to take a stab at the collapsed border
> resolution, then please do. I'll leave it aside for the moment and will
> concentrate on implementing or fixing the rest of the important features
> for table layout (BPD/height props, breaks, keeps, etc.).

Will certainly do so. Up to now, most of my time has still been spent catching
up with you guys... Hopefully there will be no more unforeseen circumstances
that keep me away for a few months, so I can finally get some really
constructive work done on those 'ideas' of mine...

<snip />
> I hope I wasn't disrespectful by snipping out and not replying to parts
> of your post.

Well, I wouldn't worry too much about that. I'm rather thick-skinned, if you
know what I mean...

And if my initial reply to the vote came across as disrespectful to
you --since you obviously have invested a great deal of your time into that
algorithm, and I made it seem like it wasn't worth much-- my apologies of
course!


Cheers,

Andreas


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 12.05.2005 22:00:54 Andreas L. Delmelle wrote:
<snip/>
> I get carried away sometimes :-)

Happens to me all the time. This stuff gets so complicated.

<snip/>

> > I can see the potential benefit of not having to take all the
> > influencing border sources into account, but precalculating some borders
> > and thus optimizing the code a bit. The beauty of the current approach
> > IMO lies in the concentration of the calculation in one spot. I
> > think your approach would make the border resolution more decentralized
> > and therefore harder to track down in the already complex maze.
> 
> Partly agreed. The more I think about the starting and ending GridUnits as
> row-boundaries, the more it seems like much of the logic I saw 'moving up'
> to the row-level would ultimately have to end up in the GridUnit anyway.
> Same for the Body, so, very much like it is now.
> 
> Still, I believe we can keep the calculation in one central spot, only split
> it up a bit, and steer the parts of that calculation from above (or below,
> depending on the view), so that certain parts get executed less frequently.

That would be good. I thought about doing something like that but
decided to get the functionality done before going into optimization.

> i.e. something like TableRowIterator.resolveBorders() on the one hand
> finishes the previous row's GridUnits' after-border segments --if any-- and
> triggers preparatory work for the next row's GridUnits' resolveBorders(),
> while the GridUnits at their end do the same for the before-borders for the
> next row (or after-borders of the body/table on breaks), so the next time
> the row-iterator arrives at resolveBorders() etc. --and that last call could
> also be forced from a break-situation, in the middle of a real 'physical'
> row, in order to finish the after-borders of the table/body/footer on the
> break, which is the only situation in which the table and body borders
> become more relevant.
> 
> This kind of interaction doesn't strike me as increasing complexity that
> much.

Good, glad to have a hand to help. :-)

> Quite on the contrary, since the resolving of the borders also happens
> at row-level, which seems to be an attractive place to deal with breaks, as
> we should have access to all related border segments in one spot.
> Although I may be missing some very nasty consequences here... :-/

There are a few. I'd appreciate it if you would invest the time to
investigate this. The more people know about this, the better.

> I'll
> think it over a bit more first, but IMO, possibly having to decide between 5
> or 6 sets of border-specs for the segments of, say 10 grid-units is making
> matters more complex than
> - rule out 3 or 4 sets once, for all 10 of them
> - decide between 2 sets, one GridUnit at a time
> ...
> - decide between 2 sets,
>   or possibly finish up in case of a break, one at a time
> 
> One immediate constraint that strikes me is that we would, strictly
> speaking, have no definite values for the border-widths of the after-border
> segments of a row's GridUnits after the first pass, since the border-widths
> for these segments could still be altered by the call to
> TRIter.resolveBorders() that would be made between the current row and the
> next row (or break)... Ultimately, we would only have a full idea on the
> effective settings of a segment after the *last* GridUnit it belongs to has
> called resolveBorders(), or *after* an effective break has triggered
> TRIter.resolveBorders().

Yep.

> Can we give the border-resolution a head-start, say create a 'buffer' of
> resolved border-specs for up to five rows ahead of the main layout? Hmm...
> maybe a bit ambitious... Two iterators running synchronously, but the
> 'heavy' one only starts after the 'light' one has reached five, from that
> point on, alternate between the two iterators until the first one runs out
> of rows, use up the buffer...? Now, that seems interesting, *if* at all
> manageable of course.

At any rate, you can already look ahead with the TableRowIterator as
much as you like. What it's currently missing is a GC mechanism for the
rows that are not needed anymore. That means we still have potential
problems on long tables.
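
A pruning hook along these lines would probably do. Everything below is
invented (names and all); nothing of it exists in the code yet:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical release mechanism for the row cache: rows the layout
    // has fully passed are dropped so they become eligible for garbage
    // collection on long tables.
    class RowCache {
        private final Deque<Object> fetchedRows = new ArrayDeque<Object>();
        private int firstBufferedIndex = 0;

        void add(Object row) { fetchedRows.addLast(row); }

        void discardRowsBefore(int firstLiveRow) {
            while (firstBufferedIndex < firstLiveRow && !fetchedRows.isEmpty()) {
                fetchedRows.removeFirst();   // row becomes unreachable
                firstBufferedIndex++;
            }
        }
    }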

<snip/>

> > > When the first row is finished, we would have two unfinished segments of
> > > after-borders. Make these available to the next row as preliminary
> > > before-border segments, which it can then resolve with its own. Next, we
> > > already know that this row contains three cells, so we need some sort of
> > > distribution method here --i.e. calculations based on relevant GridUnit
> > > IPD?-- to give an indication as to which segment(s) need to be
> > > split into how many parts....
> >
> > Now you lost me. I see a border segment as identical to a GridUnit's
> > border (not a cell's border). That's the impression I got from the spec.
> > Are you talking about handling column spanning here?
> >
> 
> Sorry, I must have been getting a bit sleepy. Picture the same table
> upside-down to begin with... But indeed, it's much better to just have one
> segment per side of a GridUnit, since the number of them on the before and
> after side is obviously a constant for every row. (--Where *was* my head
> at?)
> 
> Anyway, I guess my real point was that, IIC, in a collapsing model, one
> GridUnit's after-border segment can be the very same object/instance as the
> corresponding before-border segment of the GridUnit immediately below (or a
> Body's after-border segment on breaks). The table-components literally
> 'share' a border-segment... In a way, the segments can be said to exist
> independently from the GridUnits, a segment *belongs to* one or two
> GridUnits and the other table-components.

That's true. This has a particular implication since I'm calculating
every border segment twice ATM.

> This can also be considered true of the separate border model,

Here I disagree. Per border segment you have two potentially different
half-border specifications. If you can hold that in that separate data
structure, then it's ok again.

> but there,
> 'resolving' means 'adding to' rather than 'comparing precedence and possibly
> replacing'. I guess you could also see it as active (collapsing) vs. passive
> (separated) border segments. In the collapsing model, the borders are
> 'alive', which seems to be exactly the part that is causing the headaches...

Sort of, although I don't like the analogy with active and passive
because it introduces more terms in an already complicated environment.

Andreas, if you would like to take a stab at the collapsed border
resolution, then please do. I'll leave it aside for the moment and will
concentrate on implementing or fixing the rest of the important features
for table layout (BPD/height props, breaks, keeps, etc.).

I hope I wasn't disrespectful by snipping out and not replying to parts
of your post. I get the gist of what you have in mind and I think it's
at least a way to improve performance (even though we still don't know
how fast/slow the current code is). Concerning the collapsing borders
and their implication for the combined element generation, I urge you to
play through the RowBorder2 example in the Wiki so you see all the
problems involved. TableStepper.getNextStep() is also certainly a key
point in the whole discussion.

Jeremias Maerki


RE: [VOTE] Merge Knuth branch back into HEAD

Posted by "Andreas L. Delmelle" <a_...@pandora.be>.
> -----Original Message-----
> From: Jeremias Maerki [mailto:dev.jeremias@greenmail.ch]
>

Hi,

<snip />
> Hmm, I think you got the wrong impression. It's not that I'm having
> problems with the border resolution. This actually works fine by now even
> if it might need some additional tweaking for calculating new
> constellations in break conditions.

Ayaa... my mistake indeed! I don't even want to think about what a waste of
keystrokes that means --well, if I refrain from typing all the ugly names I
feel like calling myself, this might begin to compensate :-)

<snip />
> I probably don't get what you're targeting at, but one thing disturbs me
> here: you may not have a Row instance.
>

I always seem to forget that part... *and* it seems to have come across
differently than I intended: when I capitalized 'Row' here, I did *not* mean
the Java instances, but a 'logical' Row --which indeed may or
may not be 'physically' available as a Row instance, but the
groups/sequences of GridUnits in question will be, and that always seems
more important to me. Well, what can I say? The starting and ending
GridUnits are the row-boundaries. My Row has more or less the same status as
a GridUnit, when viewed as a virtual, one-by-one Cell. I get carried away
sometimes :-)
Then again, maybe this didn't come out well enough, but if part of the
condition is that 'p(row) > p(column) > ...', we *should* per se also have
at least TableRows and TableColumns at our disposal on which these
precedences were specified. If both of them are absent in the source
document the full condition would look something like:
p(table) > p(body) > p(cell) and p(row) = p(column)

Although it was of course a far too rough and simplified description... I
immediately noticed a few errors right after the post was out --first
border-before on a page after a break should be the body's/header's, not the
table's, and even _that_ depends-- but even then, it seems I was still a few
steps behind...

> > Mind the Capitals, and what I have already mentioned in a previous
> > post --about doing part of the resolving at row-level-- begins
> > to make a bit more sense now.
> > When the BodyLM is initialized, you can already decide between
> > 'table' and 'body' borders
>
> (for non-break conditions)

Yes, very important note indeed.

>
> > and pass that result to the RowLM,
>
> I don't use the RowLM anymore. There's only the TableLM, the
> TableContentLM and the CellLM. I know I should have removed the obsolete
> LMs by now. I simply was too deep in the mud to notice.

No worries. As indicated above: it is more the 'conceptual' row that counts,
so...

>
> The closest equivalent to the RowLM's functionality is now the
> TableRowIterator. You'd probably pass this one the result.

... yes, indeed. Whatever operates at 'row-level' and thus is in a position
to co-ordinate layout for a sequence of GridUnits that together form a row
(or a row-group).

> I can see the potential benefit of not having to take all the
> influencing border sources into account, but precalculating some borders
> and thus optimizing the code a bit. The beauty of the current approach
> IMO lies in the concentration of the calculation in one spot. I
> think your approach would make the border resolution more decentralized
> and therefore harder to track down in the already complex maze.

Partly agreed. The more I think about the starting and ending GridUnits as
row-boundaries, the more it seems like much of the logic I saw 'moving up'
to the row-level would ultimately have to end up in the GridUnit anyway.
Same for the Body, so, very much like it is now.

Still, I believe we can keep the calculation in one central spot, only split
it up a bit, and steer the parts of that calculation from above (or below,
depending on the view), so that certain parts get executed less frequently.
i.e. something like TableRowIterator.resolveBorders() on the one hand
finishes the previous row's GridUnits' after-border segments --if any-- and
triggers preparatory work for the next row's GridUnits' resolveBorders(),
while the GridUnits at their end do the same for the before-borders for the
next row (or after-borders of the body/table on breaks), so the next time
the row-iterator arrives at resolveBorders() etc. --and that last call could
also be forced from a break-situation, in the middle of a real 'physical'
row, in order to finish the after-borders of the table/body/footer on the
break, which is the only situation in which the table and body borders
become more relevant.
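
In rough, throwaway Java --none of these method signatures exist in the
code, they are invented purely to fix the idea-- the interaction would be
driven roughly like this:

    // Purely illustrative driver -- all method signatures are invented.
    // The iterator-level call seals the previous row's after-borders and
    // primes the next row; the per-GridUnit calls do the local resolving.
    void layoutRows(TableRowIterator iter) {
        GridUnit[] previous = null;
        GridUnit[] current;
        while ((current = iter.getNextRow()) != null) {
            iter.resolveBorders(previous, current);  // finish 'previous' after-borders
            for (GridUnit gu : current) {
                gu.resolveBorders();                 // start/end and before-border work
            }
            previous = current;
        }
        iter.resolveBorders(previous, null);         // end of body, or forced by a break
    }

The final call is then exactly the same call a break condition would force
in the middle of a real 'physical' row.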

This kind of interaction doesn't strike me as increasing complexity that
much. Quite on the contrary, since the resolving of the borders also happens
at row-level, which seems to be an attractive place to deal with breaks, as
we should have access to all related border segments in one spot.
Although I may be missing some very nasty consequences here... :-/ I'll
think it over a bit more first, but IMO, possibly having to decide between 5
or 6 sets of border-specs for the segments of, say 10 grid-units is making
matters more complex than
- rule out 3 or 4 sets once, for all 10 of them
- decide between 2 sets, one GridUnit at a time
...
- decide between 2 sets,
  or possibly finish up in case of a break, one at a time

One immediate constraint that strikes me is that we would, strictly
speaking, have no definite values for the border-widths of the after-border
segments of a row's GridUnits after the first pass, since the border-widths
for these segments could still be altered by the call to
TRIter.resolveBorders() that would be made between the current row and the
next row (or break)... Ultimately, we would only have a full idea on the
effective settings of a segment after the *last* GridUnit it belongs to has
called resolveBorders(), or *after* an effective break has triggered
TRIter.resolveBorders().

Can we give the border-resolution a head-start, say create a 'buffer' of
resolved border-specs for up to five rows ahead of the main layout? Hmm...
maybe a bit ambitious... Two iterators running synchronously, but the
'heavy' one only starts after the 'light' one has reached five, from that
point on, alternate between the two iterators until the first one runs out
of rows, use up the buffer...? Now, that seems interesting, *if* at all
manageable of course.
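
Sketched with invented names, just to see whether it is at all manageable:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical: a 'light' pass resolves border specs a fixed number
    // of rows ahead of the 'heavy' layout pass, feeding a small buffer.
    class BufferedResolver {
        private static final int LOOKAHEAD = 5;
        private final Deque<ResolvedRow> buffer = new ArrayDeque<ResolvedRow>();

        interface LightPass { boolean hasNext(); ResolvedRow resolveNext(); }
        interface HeavyPass { void layout(ResolvedRow row); }
        static class ResolvedRow { /* resolved border specs for one grid row */ }

        void run(LightPass light, HeavyPass heavy) {
            // head start: resolve borders for the first LOOKAHEAD rows
            while (buffer.size() < LOOKAHEAD && light.hasNext()) {
                buffer.addLast(light.resolveNext());
            }
            // then alternate until the light pass runs dry, draining the buffer
            while (!buffer.isEmpty()) {
                heavy.layout(buffer.removeFirst());
                if (light.hasNext()) {
                    buffer.addLast(light.resolveNext());
                }
            }
        }
    }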

<snip />
> > What seemed a bit awkward while I was browsing through the
> > relevant code was the constant need to pass the 'side' of
> > the GridUnit around ...
>
> "constant need"? There are four calls to GridUnit.resolveBorder() in the
> code, one for each side. There will be a couple of additional ones once
> we have figured out how to resolve (or better store) the borders for the
> break conditions.

Ok, so they just rather quickly caught my eye, and I may have mistakenly
exaggerated their weight here... :-P

<snip />
> > When the first row is finished, we would have two unfinished segments of
> > after-borders. Make these available to the next row as preliminary
> > before-border segments, which it can then resolve with its own. Next, we
> > already know that this row contains three cells, so we need some sort of
> > distribution method here --i.e. calculations based on relevant GridUnit
> > IPD?-- to give an indication as to which segment(s) need to be
> > split into how many parts....
>
> Now you lost me. I see a border segment as identical to a GridUnit's
> border (not a cell's border). That's the impression I got from the spec.
> Are you talking about handling column spanning here?
>

Sorry, I must have been getting a bit sleepy. Picture the same table
upside-down to begin with... But indeed, it's much better to just have one
segment per side of a GridUnit, since the number of them on the before and
after side is obviously a constant for every row. (--Where *was* my head
at?)

Anyway, I guess my real point was that, IIC, in a collapsing model, one
GridUnit's after-border segment can be the very same object/instance as the
corresponding before-border segment of the GridUnit immediately below (or a
Body's after-border segment on breaks). The table-components literally
'share' a border-segment... In a way, the segments can be said to exist
independently from the GridUnits, a segment *belongs to* one or two
GridUnits and the other table-components.
This can also be considered true of the separate border model, but there,
'resolving' means 'adding to' rather than 'comparing precedence and possibly
replacing'. I guess you could also see it as active (collapsing) vs. passive
(separated) border segments. In the collapsing model, the borders are
'alive', which seems to be exactly the part that is causing the headaches...


Cheers,

Andreas


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 11.05.2005 00:52:21 Andreas L. Delmelle wrote:
<snip/>
> > > Jeremias, what do you mean by complexity in certain areas? Tables
> > > only, or are there other complexities that you perceived as
> > > overwhelming?
> >
> > No, it's mainly the complexity of the collapsed border model ...
> 
> Yes, I've been thinking and reading up on that stuff, and somehow it seems a
> bit --a tiny bit-- simpler if you try to figure out
> 'collapse-with-precedence' first, since you have to decide on a purely
> numerical basis, so it may facilitate translation into an algorithm. The
> 'Eye Catching' question could then be solved as a scenario with fixed
> precedence values for the different styles, plus a factor for the widths,
> etc.

Hmm, I think you got the wrong impression. It's not that I'm having
problems with the border resolution. This actually works fine by now even
if it might need some additional tweaking for calculating new
constellations in break conditions. The design of the resolution is
already prepared to easily handle the "precedence" variant. It's just a
matter of creating an additional subclass (of CollapsingBorderModel).
The data sources for the decisions are there. The real problem lies
within the effects that borders have on the generated combined elements
list after they have been resolved. I'm sorry for not making that clear
enough. Still....(read on below)
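
To illustrate the subclass remark above: a sketch only. BorderSpec and
the determineWinner() signature are simplified assumptions here, not the
actual code.

    // Hypothetical sketch of the "collapse-with-precedence" variant.
    class CollapsingBorderModelWithPrecedence extends CollapsingBorderModel {

        BorderSpec determineWinner(BorderSpec a, BorderSpec b) {
            // collapse-with-precedence: the higher precedence value wins outright
            if (a.getPrecedence() != b.getPrecedence()) {
                return (a.getPrecedence() > b.getPrecedence()) ? a : b;
            }
            // equal precedence: fall back to the normal rules (wider wins, etc.)
            return super.determineWinner(a, b);
        }
    }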

> Still, after a look at the code and the Wiki, I had the impression that this
> path hadn't yet been taken into consideration, so hopefully this offers some
> relief...

Hmm, I actually left that out simply because I thought it would be
quite simple. I could be wrong, though.

> Starting with the simplest case, a rough description:
> p(table) > p(body) > p(row) > p(column) > p(cell) means
>    table-border for
>      border-start of the first GridUnit in a Row
>      border-end of the last GridUnit in a Row
>      border-before of all GridUnits in the first Row of a (sub)page
>      border-after of all GridUnits in the last Row of a (sub)page
>    row-border for
>      border-before of all GridUnits not in the first Row of a (sub)page
>      border-after of all GridUnits not in the last Row of a (sub)page
>    column-border for
>      border-start for all GridUnits except when first in a Row
>      border-end for all GridUnits except when last in a Row
>    body-borders and cell-borders are overruled

I probably don't get what you're targeting at, but one thing disturbs me
here: you may not have a Row instance.

> Mind the Capitals, and what I have already mentioned in a previous
> post --about doing part of the resolving at row-level-- begins to make a bit
> more sense now. When the BodyLM is initialized, you can already decide
> between 'table' and 'body' borders

(for non-break conditions)

> and pass that result to the RowLM, 

I don't use the RowLM anymore. There's only the TableLM, the
TableContentLM and the CellLM. I know I should have removed the obsolete
LMs by now. I simply was too deep in the mud to notice.

The closest equivalent to the RowLM's functionality is now the
TableRowIterator. You'd probably pass this one the result.

> that
> passes that result OR its own border-specs to its GridUnits, and the
> GridUnits ultimately only have to decide between the relevant 'row'-borders,
> 'column'-borders and their own... I think one would have a hard time getting
> closer to the meaning of 'collapsing' than this approach.

I can see the potential benefit of not having to take all the
influencing border sources into account, but precalculating some borders
and thus optimizing the code a bit. The beauty of the current approach
IMO lies in the concentration of the calculation in one spot. I
think your approach would make the border resolution more decentralized
and therefore harder to track down in the already complex maze.

> What seemed a bit awkward while I was browsing through the relevant code was
> the constant need to pass the 'side' of the GridUnit around when resolving
> the border :-/ Still, that seems more like a consequence of delaying the
> entire border-resolving process until the level of the GridUnit is reached.

"constant need"? There are four calls to GridUnit.resolveBorder() in the
code, one for each side. There will be a couple of additional ones once
we have figured out how to resolve (or better store) the borders for the
break conditions.

resolveBorder() calls go straight into determineWinner() calls on the
CollapsingBorderModel. It's not that awkward, is it?

> Also, I was juggling with the idea of creating a BorderSegment object that
> operates in conjunction with the GridUnit, but 'in between and over' Rows as
> it were... Instead of having a GridUnit 'resolve its own borders', the
> BorderSegments 'resolve themselves' at the appropriate time. In essence,
> those segments need to know nothing about 'before' or 'after', 'start' or
> 'end', they just pick the right border spec from the given set. What gave me
> this idea was Simon's example, where you need information about the
> GridUnits for the full two rows --to know how many segments there are, how
> they are distributed and which sets of border-specs are relevant for each of
> the segments.

That may be an idea. At the moment each resolved border segment is
always calculated twice, for example, once coming from the start side of
one cell and once coming from the end side of the neighbouring cell on
the left. I think that is probably the one downside of my current
approach which, given appropriate data structures, could optimize the
thing a bit.

> When the first row is finished, we would have two unfinished segments of
> after-borders. Make these available to the next row as preliminary
> before-border segments, which it can then resolve with its own. Next, we
> already know that this row contains three cells, so we need some sort of
> distribution method here --i.e. calculations based on relevant GridUnit
> IPD?-- to give an indication as to which segment(s) need to be split into
> how many parts....

Now you lost me. I see a border segment as identical to a GridUnit's
border (not a cell's border). That's the impression I got from the spec.
Are you talking about handling column spanning here?

> Then again, it seems only *really* necessary for before- and after-borders.

Not really. If you want to be thorough you'd have to do this for the
other two sides, too, to get more optimization potential.

> The border-specs for the vertical border segments could be made available to
> a GridUnit through the Column (? via the Row's column list: end-border of
> previous GridUnit = the resolved start-border of the current GridUnit's
> Column --Or am I thinking too linear --too LRTB, maybe?)

No, that's about it.

> In theory --here I go again...-- it would then be the BorderSegments that
> need information on the border specs on Table/Body/Row/(Column?)/

(column group)/ (=spanned column def, which is the only thing that is
left out ATM)

> Cell for at
> most two cells at the same time. I don't know if, in practice, this idea
> would save much compared to what you currently have... but it somehow seems
> attractive, especially in combination with the approach of resolving in
> different stages.

I don't think it would have saved much. We'd certainly have a little less
computation per border segment but I think it would make the code more
complex (border resolution more distributed) which, for tables, is not
very welcome.

> Hope this helps! :-)

Well, as I said, I don't think the border resolution per se is the
problem. It's the effects on the element generation.

Jeremias Maerki


RE: [VOTE] Merge Knuth branch back into HEAD

Posted by "Andreas L. Delmelle" <a_...@pandora.be>.
> -----Original Message-----
> From: Jeremias Maerki [mailto:dev.jeremias@greenmail.ch]
>
> On 10.05.2005 20:41:19 Simon Pepping wrote:
>

Hi guys,

For starters: my vote is +1.

I agree with Simon, and also very much feel like we're on the right track
with this. Sure, it will *still* take some work...

<snip />
> > Jeremias, what do you mean by complexity in certain areas? Tables
> > only, or are there other complexities that you perceived as
> > overwhelming?
>
> No, it's mainly the complexity of the collapsed border model ...

Yes, I've been thinking and reading up on that stuff, and somehow it seems a
bit --a tiny bit-- simpler if you try to figure out
'collapse-with-precedence' first, since you have to decide on a purely
numerical basis, so it may facilitate translation into an algorithm. The
'Eye Catching' question could then be solved as a scenario with fixed
precedence values for the different styles, plus a factor for the widths,
etc.

Still, after a look at the code and the Wiki, I had the impression that this
path hadn't yet been taken into consideration, so hopefully this offers some
relief...

Starting with the simplest case, a rough description:
p(table) > p(body) > p(row) > p(column) > p(cell) means
   table-border for
     border-start of the first GridUnit in a Row
     border-end of the last GridUnit in a Row
     border-before of all GridUnits in the first Row of a (sub)page
     border-after of all GridUnits in the last Row of a (sub)page
   row-border for
     border-before of all GridUnits not in the first Row of a (sub)page
     border-after of all GridUnits not in the last Row of a (sub)page
   column-border for
     border-start for all GridUnits except when first in a Row
     border-end for all GridUnits except when last in a Row
   body-borders and cell-borders are overruled

Mind the Capitals, and what I have already mentioned in a previous
post --about doing part of the resolving at row-level-- begins to make a bit
more sense now. When the BodyLM is initialized, you can already decide
between 'table' and 'body' borders and pass that result to the RowLM, that
passes that result OR its own border-specs to its GridUnits, and the
GridUnits ultimately only have to decide between the relevant 'row'-borders,
'column'-borders and their own... I think one would have a hard time getting
closer to the meaning of 'collapsing' than this approach.
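
Compressed into throwaway Java (illustrative only, none of this is meant
for the code base as-is):

    // Illustrative only: which source wins for each border segment of a
    // GridUnit under p(table) > p(body) > p(row) > p(column) > p(cell).
    // In this ordering, body and cell borders never win.
    class PrecedenceSketch {
        enum Side { BEFORE, AFTER, START, END }
        enum Source { TABLE, ROW, COLUMN }

        static Source winner(Side side, boolean firstInRow, boolean lastInRow,
                             boolean firstRowOfSubPage, boolean lastRowOfSubPage) {
            switch (side) {
                case START:  return firstInRow ? Source.TABLE : Source.COLUMN;
                case END:    return lastInRow ? Source.TABLE : Source.COLUMN;
                case BEFORE: return firstRowOfSubPage ? Source.TABLE : Source.ROW;
                default:     return lastRowOfSubPage ? Source.TABLE : Source.ROW; // AFTER
            }
        }
    }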

What seemed a bit awkward while I was browsing through the relevant code was
the constant need to pass the 'side' of the GridUnit around when resolving
the border :-/ Still, that seems more like a consequence of delaying the
entire border-resolving process until the level of the GridUnit is reached.

Also, I was juggling with the idea of creating a BorderSegment object that
operates in conjunction with the GridUnit, but 'in between and over' Rows as
it were... Instead of having a GridUnit 'resolve its own borders', the
BorderSegments 'resolve themselves' at the appropriate time. In essence,
those segments need to know nothing about 'before' or 'after', 'start' or
'end', they just pick the right border spec from the given set. What gave me
this idea was Simon's example, where you need information about the
GridUnits for the full two rows --to know how many segments there are, how
they are distributed and which sets of border-specs are relevant for each of
the segments.
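
As a shape for the idea (again invented, with a minimal stand-in for the
border spec):

    // Hypothetical BorderSegment: shared by at most the two GridUnits on
    // either side of it, so each segment is resolved exactly once.
    class BorderSegment {
        private BorderSpec resolved;

        // Side-agnostic collapsing: just keep whichever spec wins.
        void collapse(BorderSpec candidate) {
            if (resolved == null || candidate.beats(resolved)) {
                resolved = candidate;
            }
        }
        BorderSpec getResolved() { return resolved; }
    }

    // Minimal stand-in for a border specification with a winner test
    // (roughly: wider wins, then the style with the higher precedence).
    class BorderSpec {
        final int width;
        final int stylePrecedence;
        BorderSpec(int width, int stylePrecedence) {
            this.width = width;
            this.stylePrecedence = stylePrecedence;
        }
        boolean beats(BorderSpec other) {
            if (width != other.width) {
                return width > other.width;
            }
            return stylePrecedence > other.stylePrecedence;
        }
    }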

When the first row is finished, we would have two unfinished segments of
after-borders. Make these available to the next row as preliminary
before-border segments, which it can then resolve with its own. Next, we
already know that this row contains three cells, so we need some sort of
distribution method here --i.e. calculations based on relevant GridUnit
IPD?-- to give an indication as to which segment(s) need to be split into
how many parts....

Then again, it seems only *really* necessary for before- and after-borders.
The border-specs for the vertical border segments could be made available to
a GridUnit through the Column (? via the Row's column list: end-border of
previous GridUnit = the resolved start-border of the current GridUnit's
Column --Or am I thinking too linear --too LRTB, maybe?)

In theory --here I go again...-- it would then be the BorderSegments that
need information on the border specs on Table/Body/Row/(Column?)/Cell for at
most two cells at the same time. I don't know if, in practice, this idea
would save much compared to what you currently have... but it somehow seems
attractive, especially in combination with the approach of resolving in
different stages.


Hope this helps! :-)

Cheers,

Andreas


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 10.05.2005 20:41:19 Simon Pepping wrote:
> My worry with the new approach is performance: We know that the
> algorithms require quite some computational steps, but we have no idea
> whether in the end performance on a large document will be acceptable
> or not. (Perhaps Luca has some experimental evidence from his own
> implementation?)

I still have some performance comparisons on my todo list as preparation
for the ApacheCon session. I can run the examples through the new code
to get an idea. That's a no-brainer with my API wrapper. I'll keep you
posted.

> Jeremias, what do you mean by complexity in certain areas? Tables
> only, or are there other complexities that you perceived as
> overwhelming?

No, it's mainly the complexity of the collapsed border model plus the
implications from row spanning and, if you go further, the handling of
min/opt/max stuff, which I dared to simply ignore. There are so many
possible interactions. Take the RowBorder2 example. It took me a whole
day to run through on paper. And it still doesn't cover all the possible
cases. If you remove the column span in the header and do some nasty
stuff with the border widths, you can create really mean examples. I intend
to write one when I'm in a better mood.


Jeremias Maerki


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Simon Pepping <sp...@leverkruid.nl>.
On Tue, May 10, 2005 at 06:38:49PM +0200, Jeremias Maerki wrote:
> Still, we're at a point where we should finally say yes or no to further
> pursuing the new page breaking approach. Merging the branch back into
> HEAD means a step back for a few features and on the other side a step
> forward especially for keeps. I got the impression that the team is
> pretty much committed to continue on this path and this vote should
> confirm that.
> 
> My vote:
> At this point I'm only able to give a +0.95 where the missing 0.05 is
> due to the fact that the Knuth approach has given me headache after
> headache. There are pros and cons to the whole approach. I still cannot
> fully exclude the possibility that we're going to hit a dead end.
> And I'm still not comfortable with the complexity in certain areas,
> although you could probably say that it would be similarly complex with
> the old approach. Anyway, I've gotten used to thinking in terms of boxes,
> glue and penalties. Were it not for tables, my vote would have been
> clearer.

I am strongly impressed by the new approach. In fact I feel that
with these algorithms FOP is going to have a superior approach to
page breaking (in addition to line breaking), and hopefully to table
breaking.

My worry with the new approach is performance: We know that the
algorithms require quite some computational steps, but we have no idea
whether in the end performance on a large document will be acceptable
or not. (Perhaps Luca has some experimental evidence from his own
implementation?)

Jeremias, what do you mean by complexity in certain areas? Tables
only, or are there other complexities that you perceived as
overwhelming?

Despite the uncertainties, my vote is +1.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Glen Mazza <gm...@apache.org>.
Sounds good.  +1.

Thanks,
Glen

Jeremias Maerki wrote:

>I'm not where I would like to be yet (with table layout). Overall,
>there are still a number of problems to be solved. They are (a potentially
>incomplete list):
>
>- Table layout including header, footer, spans and borders (*)
>- Markers
>- before-floats and footnotes
>- keeps and breaks on tables
>- strength values for keeps
>- the other known table-related problems as documented on the Wiki
>- change of available IPD and BPD between pages
>- last-page
>- column-spanning and column balancing
>
>(*) ATM I've got the basic algorithm but I'm stuck with the many details
>that arise from the collapsing border model. I'm going to back off from
>this for now and instead I'm going to try and at least make the separate
>border model work. This model doesn't have these nasty interactions
>between cells that keep my head spinning. Painting this stuff on paper
>is hard enough, implementing it is even harder.
>
>Still, we're at a point where we should finally say yes or no to further
>pursuing the new page breaking approach. Merging the branch back into
>HEAD means a step back for a few features and on the other side a step
>forward especially for keeps. I got the impression that the team is
>pretty much committed to continue on this path and this vote should
>confirm that.
>
>My vote:
>At this point I'm only able to give a +0.95 where the missing 0.05 is
>due to the fact that the Knuth approach has given me headache after
>headache. There are pros and cons to the whole approach. I still cannot
>fully exclude the possibility that we're going to hit a dead end.
>And I'm still not comfortable with the complexity in certain areas,
>although you could probably say that it would be similarly complex with
>the old approach. Anyway, I've gotten used to thinking in terms of boxes,
>glue and penalties. Were it not for tables, my vote would have been
>clearer.
>
>Jeremias Maerki
>
>


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 11.05.2005 10:45:41 Chris Bowditch wrote:
> I just tried running a sample FO that contained markers and got a nasty error. 
> Are they broken due to the changes for Knuth page breaking?

Yes.

> Do you anticipate any pain in fixing them?

I can't tell, yet. I've simply ignored markers for now. But I don't
think this will cause too much pain. The implementation will be somewhat
different. I'll get to that as soon as I've got the separate border model
working on tables. The latter looks like it is relatively easy to manage
(compared to the collapsing border model, see the new RowBorder3 example).


Jeremias Maerki


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by Chris Bowditch <bo...@hotmail.com>.
Jeremias Maerki wrote:

> I'm not where I would like to be yet (with table layout). Overall,
> there are still a number of problems to be solved. They are (a potentially
> incomplete list):
> 
> - Table layout including header, footer, spans and borders (*)
> - Markers
> - before-floats and footnotes
> - keeps and breaks on tables
> - strength values for keeps
> - the other known table-related problems as documented on the Wiki
> - change of available IPD and BPD between pages
> - last-page
> - column-spanning and column balancing

I just tried running a sample FO that contained markers and got a nasty error.
Are they broken due to the changes for Knuth page breaking? Do you anticipate
any pain in fixing them?

<snip/>

> My vote:
> At this point I'm only able to give a +0.95 where the missing 0.05 is
> due to the fact that the Knuth approach has given me headache after
> headache. There are pros and cons to the whole approach. I still cannot
> fully exclude the possibility that we're going to hit a dead end.
> And I'm still not comfortable with the complexity in certain areas,
> although you could probably say that it would be similarly complex with
> the old approach. Anyway, I've gotten used to thinking in terms of boxes,
> glue and penalties. Were it not for tables, my vote would have been
> clearer.

I understand why you are not 100% sure on this vote. However, I still believe
we are making progress. I'm not convinced the Knuth approach leads to a dead
end. So here's my +1.

I understand people's concerns about performance. I fully expect it to be slow
once we get it working. I believe we should start looking for optimisations
and time-saving ideas once we have a solution that works for most
scenarios. If we try to make optimisations now, they will be undone once
we implement the missing features.

Chris


Re: [VOTE] Merge Knuth branch back into HEAD

Posted by The Web Maestro <th...@gmail.com>.
+1

On May 10, 2005, at 9:38 AM, Jeremias Maerki wrote:
> I'm not where I would like to be yet (with table layout). Overall,
> there are still a number of problems to be solved. They are (a
> potentially incomplete list):
>
> - Table layout including header, footer, spans and borders (*)
> - Markers
> - before-floats and footnotes
> - keeps and breaks on tables
> - strength values for keeps
> - the other known table-related problems as documented on the Wiki
> - change of available IPD and BPD between pages
> - last-page
> - column-spanning and column balancing
>
> (*) ATM I've got the basic algorithm but I'm stuck with the many
> details that arise from the collapsing border model. I'm going to back
> off from this for now and instead I'm going to try and at least make
> the separate border model work. This model doesn't have these nasty
> interactions between cells that keep my head spinning. Painting this
> stuff on paper is hard enough, implementing it is even harder.
>
> Still, we're at a point where we should finally say yes or no to
> further pursuing the new page breaking approach. Merging the branch
> back into HEAD means a step back for a few features and on the other
> side a step forward especially for keeps. I got the impression that
> the team is pretty much committed to continue on this path and this
> vote should confirm that.
>
> My vote:
> At this point I'm only able to give a +0.95 where the missing 0.05 is
> due to the fact that the Knuth approach has given me headache after
> headache. There are pros and cons to the whole approach. I still
> cannot fully exclude the possibility that we're going to hit a dead
> end. And I'm still not comfortable with the complexity in certain
> areas, although you could probably say that it would be similarly
> complex with the old approach. Anyway, I've gotten used to thinking in
> terms of boxes, glue and penalties. Were it not for tables, my vote
> would have been clearer.
>
> Jeremias Maerki

Regards,

Web Maestro Clay
-- 
<th...@gmail.com> - <http://homepage.mac.com/webmaestro/>
My religion is simple. My religion is kindness.
- HH The 14th Dalai Lama of Tibet