Posted to fop-dev@xmlgraphics.apache.org by Andreas L Delmelle <a_...@pandora.be> on 2006/09/01 01:21:22 UTC
Re: Implementation of hyphenation-keep property
On Aug 31, 2006, at 20:59, Jeremias Maerki wrote:
Yeah, it was a lot, wasn't it? :)
Actually, I was preparing the post myself as yours came in, so I
decided to c&p it into a reply, since it did seem to address the same
basic issue: interaction between line- and page-breaking.
> What I can deduce from this is that my suspicion is probably correct
> that implementing hyphenation-keep will be quite tricky with the
> current code. I assume we have to do a few changes to make page- and
> line-breaking interact more closely (for "changing available IPD"
> etc.).
Looking closer at hyphenation-keep: this indeed seems very tricky in
the current situation.
I got the idea while debugging the behavior when processing the
disabled testcase 'page-breaking_4.xml'.
Notice that the FlowLM's getNextKnuthElements() is currently only
called once, which triggers line-breaking for the entire page-
sequence given a LayoutContext with ipd equal to that of the first
page's region-body. The second page is only prepared after all line-
breaks and the first page-break have been computed.
Note that this results in optimal line-layout for non-paginated
media. It would be perfect for an HTML page (or a page with
indefinite height) to compute all line-breaks in one go. Given the
current page-breaking algorithm, which performs outstandingly, nobody
will notice a thing if all pages use the same page-master. From the
moment you add a second one with a slightly narrower/wider region-
body, you're in trouble, it seems. :/
Taking into account the possibility of varying ipd due to different
page-masters or deferred side-floats... this definitely needs to be
changed.
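To illustrate the problem described above, here is a minimal toy sketch. It is not FOP code: the function names, the character-count widths, and the fixed number of lines per page are all assumptions made for the example. Lines are broken once using only the first page's ipd, then distributed over pages, and any line that lands on a narrower page overflows.

```python
# Toy model only, not FOP code. All names and numbers are hypothetical.

def break_lines(words, ipd):
    """Greedy line breaking: pack words into lines no wider than ipd characters."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if not current or len(candidate) <= ipd:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def paginate(lines, lines_per_page):
    """Split already-broken lines into pages holding a fixed number of lines."""
    return [lines[i:i + lines_per_page] for i in range(0, len(lines), lines_per_page)]


def overflowing_lines(pages, page_ipds):
    """Return (page_index, line) pairs for lines wider than their page's ipd."""
    return [
        (index, line)
        for index, page in enumerate(pages)
        for line in page
        if len(line) > page_ipds[index]
    ]


words = "aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii jjjj".split()
page_ipds = [14, 9]  # the second page's region-body is narrower than the first

# All line breaks are computed in one go, using only the first page's ipd.
lines = break_lines(words, page_ipds[0])
pages = paginate(lines, lines_per_page=2)
overflow = overflowing_lines(pages, page_ipds)
```

With equal ipds on both pages, `overflow` would be empty, which matches the observation that nobody notices anything as long as all pages use the same page-master.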
As a side-note, looking at memory consumption: it would at least
offer a chance to perform cleanup if the algorithm jumps from page-
layout to line-layout and back.
Looking at computational complexity: it remains a hypothesis FTM, but
I'm guessing that in certain areas this may even be reduced by the
extra info that would become available to both breaking algorithms if
they interact.
I haven't looked too closely yet at the current implementation of
multi-column layout, but it seems like a mechanism for getting the
changed available ipd already exists somehow. How are the line-breaks
rearranged/recomputed there precisely?
Cheers,
Andreas
Re: Implementation of hyphenation-keep property
Posted by Andreas L Delmelle <a_...@pandora.be>.
On Sep 1, 2006, at 14:45, Jeremias Maerki wrote:
>> <snip />
>> I got the idea while debugging the behavior when processing the
>> disabled testcase 'page-breaking_4.xml'.
>> Notice that the FlowLM's getNextKnuthElements() is currently only
>> called once, which triggers line-breaking for the entire page-
>> sequence given a LayoutContext with ipd equal to that of the first
>> page's region-body. The second page is only prepared after all line-
>> breaks and the first page-break have been computed.
>
> Doesn't sound like total-fit anymore, more like best-fit. Vincent and
> Simon said total-fit for page breaking is very important.
I understand and completely agree, which is why I explicitly looked
for options that would, by default, come down roughly to the same
thing we have now, only refined/corrected.
To me it looks like right now we have a total-fit page-breaking,
combined with a possible non-fit line-breaking. Since line-breaking
is unaware of available bpd, it cannot take into account any
overflows in that direction (and implied ipd-changes for the next
lines).
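For readers less familiar with the terminology, the difference between a total-fit strategy and a line-at-a-time strategy can be sketched on a toy line-breaking problem. This is an illustration only, not FOP's breaking algorithm: the cost function (squared unused width per line, last line free), the character-count widths, and all names are assumptions, and every word is assumed to fit on a line by itself.

```python
# Toy model only, not FOP code. All names and the cost function are hypothetical.

def slack_cost(words, start, end, width):
    """Squared unused width of a line holding words[start:end], or None if too wide."""
    length = sum(len(w) for w in words[start:end]) + (end - start - 1)
    if length > width:
        return None
    return (width - length) ** 2


def line_at_a_time(words, width):
    """Fill each line as far as possible without looking ahead."""
    breaks, start = [], 0
    while start < len(words):
        end = start + 1
        while end < len(words) and slack_cost(words, start, end + 1, width) is not None:
            end += 1
        breaks.append(end)
        start = end
    return breaks


def total_cost(words, breaks, width):
    """Sum of line costs for a break sequence; the last line is not penalized."""
    cost, start = 0, 0
    for end in breaks[:-1]:
        cost += slack_cost(words, start, end, width)
        start = end
    return cost


def total_fit(words, width):
    """Dynamic programming over all break sequences, minimizing the summed cost."""
    n = len(words)
    best = [None] * (n + 1)  # best[i] = (cost, breaks) for laying out words[i:]
    best[n] = (0, [])
    for start in range(n - 1, -1, -1):
        for end in range(start + 1, n + 1):
            cost = slack_cost(words, start, end, width)
            if cost is None:
                break  # adding more words only makes the line wider
            if end == n:
                cost = 0  # last line is free
            tail_cost, tail_breaks = best[end]
            if best[start] is None or cost + tail_cost < best[start][0]:
                best[start] = (cost + tail_cost, [end] + tail_breaks)
    return best[0]


words = ["aaa", "bb", "cc", "ddddd"]
greedy_breaks = line_at_a_time(words, 6)
greedy_cost = total_cost(words, greedy_breaks, 6)
optimal_cost, optimal_breaks = total_fit(words, 6)
```

Here the line-at-a-time strategy fills the first line completely and pays for it with a nearly empty second line, while the total-fit search accepts a looser first line to get a lower overall cost.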
As you point out, available ipd can only change in case of forced
breaks/span changes. But even then it seems like we cannot precisely
determine which page we're on, unless we'd run the PageBreaker over
the element-list up to that point. That's something I'd like to
avoid, since this would break the total-fit page-breaking (unless the
page-breaks are recomputed afterwards...)
If the goal is to achieve a total-fit for both line- and page-breaks,
and we don't want to waste resources on a whole bunch of unnecessary
break computations, then it seems like the wisest thing to do first,
is to try and see if we can move the page-generation in such a way
that the line-breaking algorithm is always aware of the 'current'
page, while still no actual page-breaks are computed. The latter can
still wait until we have collected the full list of line-breaks.
Could get quite tricky, though. Seems like the line-breaking
algorithm would also need to take into account space-before/-after,
in order to register correct bp-advancements... bp-advancement is not
simply line-height, but line-height + resolved space-before +
resolved space-after. :/
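A rough sketch of that idea, under heavy assumptions: the `PageMaster` class, the function names, and the greedy strategy below are hypothetical and not FOP code. Space-before/-after are taken as already-resolved numbers, every line has the same height, and the last page master is reused for any further pages. The point is only to show a line breaker that registers bp-advancement per line so it knows which page's ipd applies, without running a real page breaker.

```python
# Toy model only, not FOP code. All names and numbers are hypothetical.
from dataclasses import dataclass


@dataclass
class PageMaster:
    ipd: int  # available inline-progression-dimension (line width, in characters)
    bpd: int  # available block-progression-dimension (page height)


def bp_advancement(line_height, space_before=0, space_after=0):
    """Block-progression advance of one line, including resolved spaces."""
    return line_height + space_before + space_after


def master_for(masters, page_index):
    """Page master for a page; the last master repeats for later pages."""
    return masters[min(page_index, len(masters) - 1)]


def break_lines_page_aware(words, masters, line_height, space_before=0, space_after=0):
    """Greedy line breaking that tracks used bpd to pick the current page's ipd.

    Returns a list of (page_index, line) pairs.
    """
    advance = bp_advancement(line_height, space_before, space_after)
    placed = []
    page_index, used_bpd, current = 0, 0, ""
    for word in words:
        master = master_for(masters, page_index)
        candidate = word if not current else current + " " + word
        if not current or len(candidate) <= master.ipd:
            current = candidate
            continue
        # Close the current line and register its bp-advancement.
        placed.append((page_index, current))
        used_bpd += advance
        # If one more line would not fit, following lines belong to the next page.
        if used_bpd + advance > master.bpd:
            page_index += 1
            used_bpd = 0
        current = word
    if current:
        placed.append((page_index, current))
    return placed


words = "aaaa bbbb cccc dddd eeee ffff gggg hhhh iiii jjjj".split()
masters = [PageMaster(ipd=14, bpd=30), PageMaster(ipd=9, bpd=30)]
placed = break_lines_page_aware(words, masters, line_height=12,
                                space_before=2, space_after=1)
```

Note that this sketch commits to page assignments greedily, so it is not a total-fit approach; it only shows the bookkeeping that line breaking would need in order to know the 'current' page.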
Later,
Andreas
Re: Implementation of hyphenation-keep property
Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 01.09.2006 01:21:22 Andreas L Delmelle wrote:
> On Aug 31, 2006, at 20:59, Jeremias Maerki wrote:
>
> Yeah, it was a lot, wasn't it? :)
> Actually, I was preparing the post myself as yours came in, so I
> decided to c&p it into a reply, since it did seem to address the same
> basic issue: interaction between line- and page-breaking.
>
> > What I can deduce from this is that my suspicion is probably correct
> > that implementing hyphenation-keep will be quite tricky with the
> > current code. I assume we have to do a few changes to make page- and
> > line-breaking interact more closely (for "changing available IPD"
> > etc.).
>
> Looking closer at hyphenation-keep: this indeed seems very tricky in
> the current situation.
>
> I got the idea while debugging the behavior when processing the
> disabled testcase 'page-breaking_4.xml'.
> Notice that the FlowLM's getNextKnuthElements() is currently only
> called once, which triggers line-breaking for the entire page-
> sequence given a LayoutContext with ipd equal to that of the first
> page's region-body. The second page is only prepared after all line-
> breaks and the first page-break have been computed.
Doesn't sound like total-fit anymore, more like best-fit. Vincent and
Simon said total-fit for page breaking is very important.
> Note that this results in optimal line-layout for non-paginated
> media. It would be perfect for an HTML page (or a page with
> indefinite height) to compute all line-breaks in one go. Given the
> current page-breaking algorithm, which performs outstandingly, nobody
> will notice a thing if all pages use the same page-master. From the
> moment you add a second one with a slightly narrower/wider region-
> body, you're in trouble, it seems. :/
Yep, that's one of the current problems.
> Taking into account the possibility of varying ipd due to different
> page-masters or deferred side-floats... this definitely needs to be
> changed.
>
> As a side-note, looking at memory consumption: it would at least
> offer a chance to perform cleanup if the algorithm jumps from page-
> layout to line-layout and back.
> Looking at computational complexity: it remains a hypothesis FTM, but
> I'm guessing that in certain areas this may even be reduced by the
> extra info that would become available to both breaking algorithms if
> they interact.
> I haven't looked too closely yet at the current implementation of
> multi-column layout, but it seems like a mechanism for getting the
> changed available ipd already exists somehow. How are the line-breaks
> rearranged/recomputed there precisely?
Available IPD can currently only change if you have a forced page break.
Here the page breaking process restarts. Line-breaks are not recomputed
AFAIK. Otherwise, I'd be much happier. The only elements we currently
have for multi-column layout are:
* Column balancing logic
* Span-change logic (page breaking is interrupted on a span-change,
  different block sequence)
Otherwise, a column is just like any other page with only one column.
That's why we can't do keep.within-page, yet.
Jeremias Maerki