Posted to fop-dev@xmlgraphics.apache.org by Manuel Mall <ma...@apache.org> on 2006/06/18 13:36:45 UTC

keep...="always" and Knuth penalties

It has been reported on fop-user that the current keep implementation is 
probably too strict: instead of weakening a keep condition when no 
solution can be found, FOP trunk currently gives up with an error 
message.

I had a quick fiddle in one area and changed the Knuth penalty generated 
for a keep...="always" from INFINITE to INFINITE-1. In my few test 
cases, that seemed to resolve the issue.

However, I am interested in comments from those more familiar with the 
mathematical background of the Knuth algorithm on whether such a solution 
is appropriate, or whether there could be unintended side effects, e.g. 
this INFINITE-1 break being chosen even when there are other allowed 
breaks which should be preferred according to the FO input but receive 
higher penalties when run through the Knuth algorithm.

Or should we use a more refined approach where we initially generate an 
INFINITE penalty but, if the page breaking cannot find a solution, we 
reduce the penalty on some/all of those elements given an INFINITE 
penalty because of keeps and run the page breaker again?
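For illustration, the two options under discussion might be sketched like this. This is hypothetical code: the class and method names are made up and do not match FOP's actual layout managers; only the "1000 as infinity" penalty convention follows the rest of this thread.

```java
// Hypothetical sketch of the two approaches discussed above; names are
// invented and do not correspond to FOP's real classes.
public class KeepPenaltySketch {

    static final int INFINITE = 1000; // finite value standing in for "infinity"

    // Option 1: single pass. Emit INFINITE-1 so the break stays legal
    // but is only ever chosen as a last resort.
    static int penaltyForKeepAlwaysSinglePass() {
        return INFINITE - 1;
    }

    // Option 2: two passes. Start with INFINITE (break forbidden); if the
    // page breaker fails, relax the keep penalties and run it again.
    static int[] relaxKeepPenalties(int[] penalties) {
        int[] relaxed = penalties.clone();
        for (int i = 0; i < relaxed.length; i++) {
            if (relaxed[i] >= INFINITE) {
                relaxed[i] = INFINITE - 1; // now a legal, heavily penalized break
            }
        }
        return relaxed;
    }
}
```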

Manuel

Re: keep...="always" and Knuth penalties

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 19.06.2006 14:45:02 Manuel Mall wrote:
<snip/>
> What is still unclear to me is whether it is worthwhile to implement this 
> two-pass approach, i.e. use INFINITE penalties first and relax later, or 
> whether it is good enough for 99.99% of cases just to start with INFINITE-1 
> penalties for mandatory keeps.

I think we should see if a single-pass with INFINITE-1 does well enough.
I suspect so, but I'm not sure.

Jeremias Maerki


Re: keep...="always" and Knuth penalties

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 19.06.2006 15:45:36 Luca Furini wrote:
> Manuel Mall wrote:
> 
> > What is still unclear to me is whether it is worthwhile to implement this
> > two-pass approach, i.e. use INFINITE penalties first and relax later, or
> > whether it is good enough for 99.99% of cases just to start with INFINITE-1
> > penalties for mandatory keeps.
> 
> I think the second pass is necessary, in order to be sure that we 
> break a keep only when there really isn't any other alternative. 
> Otherwise, I'm sure that for each value < INFINITE we use, we could create 
> a (contrived) example where the algorithm prefers breaking the keep 
> over a different, legal (but somewhat uglier) break, thus 
> behaving in a non-conformant way.
> 
> Reading the spec again, I even start to wonder whether it would really be 
> right to allow a break between objects tied by a keep constraint:
> 
> "Each keep condition must also be satisfied, except when this would cause 
> a break condition or a stronger keep condition to fail to be satisfied. If 
> not all of a set of keep conditions of equal strength can be satisfied, 
> then some maximal satisfiable subset of conditions of that strength must 
> be satisfied (together with all break conditions and maximal subsets of 
> stronger keep conditions, if any)."
> 
> It seems to me that the prescribed behaviour requires a keep constraint 
> with force = "always" to be satisfied *always* :-), even if this would 
> mean having some overflowing content. 

Obviously, we disagree here. I read it as meaning that "always" can also
be relaxed if the keep cannot be satisfied. Did anyone check what other
implementations do?

> Moreover, even a keep with 
> force = N could be broken only in order to satisfy a keep with greater 
> force, and not to avoid an overflow.
> 
> I seem to recall that in Knuth's paper the author mentions a symbol he 
> introduced in TeX to represent a space that could be used as a line break 
> in dire straits, having a penalty value = inf-1 (where inf was the special 
> finite value representing infinity). Maybe we could similarly add some 
> "soft-keep" extensions?
> 
> Regards
>      Luca



Jeremias Maerki


Re: keep...="always" and Knuth penalties

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 19.06.2006 14:54:06 Luca Furini wrote:
> Manuel Mall wrote:
> 
> > Yes, there is a force parameter and it seems to be always set to true 
> > for page breaking (and false for line breaking). But it doesn't seem to 
> > guarantee that breaks will be found; otherwise we wouldn't get 
> > the "giving up after 50 retries" message.
> 
> The "force" parameter requires the algorithm to find a set of breaks, even 
> if some of them could require an adjustment ratio greater than the 
> maximum one, or smaller than -1 (those breaks are represented by the nodes 
> stored in lastTooLong and lastTooShort).
> 
> During the line breaking, it is first set to "false", so that we first try 
> to find "good" line breaks, and it is set to "true" only if the first tries 
> fail and we need to create some ugly lines (too short or too long, just 
> as underfull and overfull lines in TeX).
> 
> The last call to findBreakingPoints (which, for the page breaking, is also 
> the first one) needs to have force = true, as we need to have some breaks 
> to continue (if force == false and the algorithm fails we have no breaks 
> after the end of the method).
> 
> The "overflow recovery" mechanism tries to avoid using a "long" break 
> (i.e. one that makes some content overflow the available space) by 
> inserting an empty page and trying again: this works if one of the 
> following pages is taller.
> 
> This could happen, for example, when we have a block spanning all columns 
> and then some other content that must flow in the columns: due to the 
> presence of the spanning block, the first page will have some short 
> columns, while the following pages could use the whole body height (this 
> example makes me think that 50 attempts could even be too many, but maybe 
> there are other situations where such a large number is really needed).

Good point. 50 is an arbitrary value I chose, just to make sure. :-) It's
pretty unrealistic to expect more than 5 or 10 empty pages before a page
is found with more available BPD.

> Should this strategy not work (for example, we are trying to place a 30cm 
> image inside a 20cm-high body), instead of throwing the exception we could 
> use the lastTooLong node (which always exists), after removing the "empty 
> page" nodes.

Good idea.
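The recovery strategy being discussed, deferring overflowing content to later (possibly taller) parts a bounded number of times, then falling back to the "too long" break instead of throwing, might be sketched roughly like this. All names here are hypothetical and do not match FOP's actual implementation.

```java
// Hypothetical sketch of the overflow-recovery loop discussed above; names
// are invented and do not correspond to FOP's real page-breaking code.
import java.util.List;

public class OverflowRecoverySketch {

    static final int MAX_RETRIES = 50;

    interface PageBreaker {
        /** Returns break positions, or null if no legal set of breaks exists. */
        List<Integer> tryBreaks(int emptyPagesInserted);
    }

    static List<Integer> breakWithRecovery(PageBreaker breaker,
                                           List<Integer> lastTooLong) {
        for (int retry = 0; retry <= MAX_RETRIES; retry++) {
            List<Integer> breaks = breaker.tryBreaks(retry);
            if (breaks != null) {
                return breaks;
            }
        }
        // Instead of throwing an exception, accept the overfull
        // ("too long") break as a last resort.
        return lastTooLong;
    }

    public static void main(String[] args) {
        // Content fits only once two empty parts have been inserted,
        // e.g. because a later page has more available height.
        PageBreaker breaker = empty -> empty >= 2 ? List.of(3, 7) : null;
        System.out.println(breakWithRecovery(breaker, List.of(9))); // prints [3, 7]
    }
}
```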


Jeremias Maerki


Re: keep...="always" and Knuth penalties

Posted by Manuel Mall <ma...@apache.org>.
On Monday 19 June 2006 20:25, Jeremias Maerki wrote:
> On 19.06.2006 13:38:56 Manuel Mall wrote:
> > On Monday 19 June 2006 16:45, Jeremias Maerki wrote:
> > > On 18.06.2006 20:57:51 Simon Pepping wrote:
> > > > On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
> >
> > <snip/>
> >
> > > > > Or should we use a more refined approach where we generate
> > > > > initially an INFINITE penalty but if the page breaking cannot
> > > > > find a solution we reduce the penalty on some/all of those
> > > > > elements given an INFINITE penalty because of keeps and run
> > > > > the page breaker again?
> > > >
> > > > I am in favor of this solution. There are generally two
> > > > solutions: increase the tolerance, or force a solution. I think
> > > > FOP already has a force parameter for this purpose.
> > >
> > > +1. Yes, BreakingAlgorithm has a "force" parameter which is
> > > currently set to true for page breaking. There's also a
> > > "threshold". We can probably play with that first. See
> > > LineLayoutManager.findOptimalBreakPoints().
> >
> > Yes, there is a force parameter and it seems to be always set to
> > true for page breaking (and false for line breaking). But it
> > doesn't seem to guarantee that breaks will be found; otherwise we
> > wouldn't get the "giving up after 50 retries" message.
>
> Right, but that's because an INFINITE penalty cannot make a legal
> break. We still need to set the penalties to a different value to get
> legal, but very undesired break points.
>
> > Does anyone understand how this force parameter is supposed to work?
>
> I think it's called "secondpass" in Knuth's article. The "force"
> parameter is used to force acceptance of a line whose adjustment
> ratio exceeds the "threshold" value. Otherwise, there could be a
> situation where no break solution is found (which is used in
> line-breaking). In page breaking, the force parameter essentially
> allows overfull pages although that situation's caught by the
> partOverflowRecovery which inserts empty lines/parts. In the end, we
> always find a solution in line-breaking (by letting the line
> overflow) and in page-breaking (by trying to defer the overflowing
> element to a later part/page and giving up after 60 retries).
>

Thanks for the explanation Jeremias. Makes sense to me now.

What is still unclear to me is whether it is worthwhile to implement this 
two-pass approach, i.e. use INFINITE penalties first and relax later, or 
whether it is good enough for 99.99% of cases just to start with 
INFINITE-1 penalties for mandatory keeps.

> Jeremias Maerki

Manuel

Re: keep...="always" and Knuth penalties

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 19.06.2006 13:38:56 Manuel Mall wrote:
> On Monday 19 June 2006 16:45, Jeremias Maerki wrote:
> > On 18.06.2006 20:57:51 Simon Pepping wrote:
> > > On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
> <snip/>
> > > > Or should we use a more refined approach where we generate
> > > > initially an INFINITE penalty but if the page breaking cannot
> > > > find a solution we reduce the penalty on some/all of those
> > > > elements given an INFINITE penalty because of keeps and run the
> > > > page breaker again?
> > >
> > > I am in favor of this solution. There are generally two solutions:
> > > increase the tolerance, or force a solution. I think FOP already
> > > has a force parameter for this purpose.
> >
> > +1. Yes, BreakingAlgorithm has a "force" parameter which is currently
> > set to true for page breaking. There's also a "threshold". We can
> > probably play with that first. See
> > LineLayoutManager.findOptimalBreakPoints().
> >
> 
> Yes, there is a force parameter and it seems to be always set to true 
> for page breaking (and false for line breaking). But it doesn't seem to 
> guarantee that breaks will be found; otherwise we wouldn't get 
> the "giving up after 50 retries" message.

Right, but that's because an INFINITE penalty cannot make a legal break.
We still need to set the penalties to a different value to get legal,
but very undesired break points.

> Does anyone understand how this force parameter is supposed to work?

I think it's called "secondpass" in Knuth's article. The "force" parameter
is used to force acceptance of a line whose adjustment ratio exceeds the
"threshold" value. Otherwise, there could be a situation where no break
solution is found (which is used in line-breaking). In page breaking, the
force parameter essentially allows overfull pages although that
situation's caught by the partOverflowRecovery which inserts empty
lines/parts. In the end, we always find a solution in line-breaking (by
letting the line overflow) and in page-breaking (by trying to defer the
overflowing element to a later part/page and giving up after 60 retries).
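The interplay of "force" and "threshold" described above can be sketched in simplified, hypothetical form; this is not FOP's actual BreakingAlgorithm logic, just an illustration of the acceptance test for a candidate break.

```java
// Simplified, hypothetical sketch of how a "force" flag interacts with the
// adjustment-ratio "threshold"; not FOP's actual BreakingAlgorithm code.
public class ForceThresholdSketch {

    /**
     * Decide whether a candidate break is acceptable.
     *
     * @param adjustmentRatio how much the glue must stretch (> 0) or
     *                        shrink (< 0) to make the line/page fit
     * @param threshold       maximum acceptable stretch ratio
     * @param force           if true, accept even "ugly" breaks so that
     *                        the algorithm always produces some result
     */
    static boolean acceptBreak(double adjustmentRatio, double threshold,
                               boolean force) {
        // A break is feasible when the glue can absorb the difference:
        // shrink no further than -1, stretch no further than the threshold.
        boolean feasible = adjustmentRatio >= -1 && adjustmentRatio <= threshold;
        return feasible || force;
    }
}
```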

Jeremias Maerki


Re: keep...="always" and Knuth penalties

Posted by Vincent Hennebert <vi...@enseeiht.fr>.
FYI: I'm planning to refactor the breaking algorithm in order to
implement floats. I'll see what can be done in this area. Just keep in
touch.

Vincent


Manuel Mall wrote:
> On Monday 19 June 2006 16:45, Jeremias Maerki wrote:
> 
>>On 18.06.2006 20:57:51 Simon Pepping wrote:
>>
>>>On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
> 
> <snip/>
> 
>>>>Or should we use a more refined approach where we generate
>>>>initially an INFINITE penalty but if the page breaking cannot
>>>>find a solution we reduce the penalty on some/all of those
>>>>elements given an INFINITE penalty because of keeps and run the
>>>>page breaker again?
>>>
>>>I am in favor of this solution. There are generally two solutions:
>>>increase the tolerance, or force a solution. I think FOP already
>>>has a force parameter for this purpose.
>>
>>+1. Yes, BreakingAlgorithm has a "force" parameter which is currently
>>set to true for page breaking. There's also a "threshold". We can
>>probably play with that first. See
>>LineLayoutManager.findOptimalBreakPoints().
>>
> 
> 
> Yes, there is a force parameter and it seems to be always set to true 
> for page breaking (and false for line breaking). But it doesn't seem to 
> guarantee that breaks will be found; otherwise we wouldn't get 
> the "giving up after 50 retries" message.
> 
> Does anyone understand how this force parameter is supposed to work?
> 
> 
>>Jeremias Maerki
> 
> 
> Manuel


Re: keep...="always" and Knuth penalties

Posted by Manuel Mall <ma...@apache.org>.
On Monday 19 June 2006 16:45, Jeremias Maerki wrote:
> On 18.06.2006 20:57:51 Simon Pepping wrote:
> > On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
<snip/>
> > > Or should we use a more refined approach where we generate
> > > initially an INFINITE penalty but if the page breaking cannot
> > > find a solution we reduce the penalty on some/all of those
> > > elements given an INFINITE penalty because of keeps and run the
> > > page breaker again?
> >
> > I am in favor of this solution. There are generally two solutions:
> > increase the tolerance, or force a solution. I think FOP already
> > has a force parameter for this purpose.
>
> +1. Yes, BreakingAlgorithm has a "force" parameter which is currently
> set to true for page breaking. There's also a "threshold". We can
> probably play with that first. See
> LineLayoutManager.findOptimalBreakPoints().
>

Yes, there is a force parameter and it seems to be always set to true 
for page breaking (and false for line breaking). But it doesn't seem to 
guarantee that breaks will be found; otherwise we wouldn't get 
the "giving up after 50 retries" message.

Does anyone understand how this force parameter is supposed to work?

>
> Jeremias Maerki

Manuel

Re: keep...="always" and Knuth penalties

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 18.06.2006 20:57:51 Simon Pepping wrote:
> On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
> > I had a quick fiddle in one area and changed the Knuth penalty generated 
> > for a keep...="always" from INFINITE to INFINITE-1. In my few test 
> > cases, that seemed to resolve the issue.
> > 
> > However, I am interested in comments from those more familiar with the 
> > mathematical background of the Knuth algorithm on whether such a solution 
> > is appropriate, or whether there could be unintended side effects, e.g. 
> > this INFINITE-1 break being chosen even when there are other allowed 
> > breaks which should be preferred according to the FO input but receive 
> > higher penalties when run through the Knuth algorithm.
> 
> Mathematically INFINITE-1 = INFINITE. Its having a different effect on
> FOP's layout decisions causes conceptual problems. I do not know the
> details of FOP's implementation of INFINITE. In my own implementation
> INFINITE gets special treatment. An effort to use values close to
Integer.MAX_VALUE (which is my INFINITE) would lead to arithmetic
> overflow, simply because I do not expect them, and I think I have good
> reason to do so.

AFAIK, there IS special treatment of INFINITE in the existing breaking
algorithm. I don't think we're looking at the mathematical definition
of INFINITE-1 here. INFINITE-1 is just the highest penalty which does not
prohibit a break. And that's pretty much what we need.
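That special treatment might look roughly like this. This is only a sketch following the conventions in this thread (a finite constant standing in for infinity); FOP's actual penalty classes may differ.

```java
// Sketch of the special-casing described above: a penalty equal to INFINITE
// forbids a break entirely, while INFINITE-1 is merely the worst legal break.
// The constant follows the finite-infinity convention discussed in this
// thread; FOP's real classes may differ.
public class PenaltySketch {

    static final int INFINITE = 1000;

    static boolean isLegalBreak(int penalty) {
        return penalty < INFINITE; // INFINITE itself prohibits the break
    }

    static boolean isForcedBreak(int penalty) {
        return penalty <= -INFINITE; // -INFINITE forces a break
    }
}
```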

> > Or should we use a more refined approach where we generate initially an 
> > INFINITE penalty but if the page breaking cannot find a solution we 
> > reduce the penalty on some/all of those elements given an INFINITE 
> > penalty because of keeps and run the page breaker again?
> 
> I am in favor of this solution. There are generally two solutions:
> increase the tolerance, or force a solution. I think FOP already has a
> force parameter for this purpose.

+1. Yes, BreakingAlgorithm has a "force" parameter which is currently set
to true for page breaking. There's also a "threshold". We can probably
play with that first. See LineLayoutManager.findOptimalBreakPoints().


Jeremias Maerki


Re: keep...="always" and Knuth penalties

Posted by Simon Pepping <sp...@leverkruid.eu>.
On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:
> I had a quick fiddle in one area and changed the Knuth penalty generated 
> for a keep...="always" from INFINITE to INFINITE-1. In my few test 
> cases, that seemed to resolve the issue.
> 
> However, I am interested in comments from those more familiar with the 
> mathematical background of the Knuth algorithm on whether such a solution 
> is appropriate, or whether there could be unintended side effects, e.g. 
> this INFINITE-1 break being chosen even when there are other allowed 
> breaks which should be preferred according to the FO input but receive 
> higher penalties when run through the Knuth algorithm.

Mathematically INFINITE-1 = INFINITE. Its having a different effect on
FOP's layout decisions causes conceptual problems. I do not know the
details of FOP's implementation of INFINITE. In my own implementation
INFINITE gets special treatment. An effort to use values close to
Integer.MAX_VALUE (which is my INFINITE) would lead to arithmetic
overflow, simply because I do not expect them, and I think I have good
reason to do so.

> Or should we use a more refined approach where we generate initially an 
> INFINITE penalty but if the page breaking cannot find a solution we 
> reduce the penalty on some/all of those elements given an INFINITE 
> penalty because of keeps and run the page breaker again?

I am in favor of this solution. There are generally two solutions:
increase the tolerance, or force a solution. I think FOP already has a
force parameter for this purpose.

Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu

Re: keep...="always" and Knuth penalties

Posted by Vincent Hennebert <vh...@gmail.com>.
2006/6/18, Manuel Mall <ma...@apache.org>:
> It has been reported on fop-user that the current keep implementation is
> probably too strict: instead of weakening a keep condition when no
> solution can be found, FOP trunk currently gives up with an error
> message.
>
> I had a quick fiddle in one area and changed the Knuth penalty generated
> for a keep...="always" from INFINITE to INFINITE-1. In my few test
> cases, that seemed to resolve the issue.

In fact you have turned the corresponding element into a legal
breakpoint, which was not the case when the penalty was infinite:
before, the element was not even considered by the algorithm as a
possible breakpoint.


> However, I am interested in comments of those more familiar with the
> mathematical background behind the Knuth algorithm if such a solution
> is appropriate or if there could be unintended side effects, e.g. this
> INFINITE-1 break being chosen even if there are other allowed breaks
> which should be preferred according to the FO input but have higher
> penalties when run through the Knuth algorithm.

The new legal breakpoints turn out to be feasible breakpoints, but with
very high demerits. In fact there are two different infinities here: the
value of infinite penalties (1000 in FOP, IIC), and the value of
demerits. In FOP the set of best active nodes is initialized to an empty
set, and the corresponding demerits are initialized to
Double.POSITIVE_INFINITY (see layoutmgr.BreakingAlgorithm.BestRecords).
As soon as a feasible breakpoint is found it will be recorded, since its
demerits, however high, will be lower than POSITIVE_INFINITY. You now
end up with a non-empty set of "very bad" breakpoints.
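The interaction of the two infinities can be sketched like this. This is hypothetical code: FOP's BestRecords bookkeeping is more involved; only the idea that any finite demerit value displaces the initial POSITIVE_INFINITY is taken from the paragraph above.

```java
// Sketch of the two infinities described above: demerit records start at
// Double.POSITIVE_INFINITY, so any feasible breakpoint, however bad,
// displaces the initial "no break found" state. Hypothetical code; FOP's
// BestRecords does more bookkeeping than this.
public class BestRecordSketch {

    private double bestDemerits = Double.POSITIVE_INFINITY;
    private int bestBreakIndex = -1;

    void considerBreak(int index, double demerits) {
        // Even a "very bad" break (huge but finite demerits, e.g. from an
        // INFINITE-1 penalty) beats the initial POSITIVE_INFINITY.
        if (demerits < bestDemerits) {
            bestDemerits = demerits;
            bestBreakIndex = index;
        }
    }

    boolean hasBreak() {
        return bestBreakIndex >= 0;
    }

    int bestBreakIndex() {
        return bestBreakIndex;
    }
}
```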

This should work, as the formula used to compute demerits will favour a
very loose page over a page which ends at a very high penalty item.
However...


> Or should we use a more refined approach where we generate initially an
> INFINITE penalty but if the page breaking cannot find a solution we
> reduce the penalty on some/all of those elements given an INFINITE
> penalty because of keeps and run the page breaker again?

... I think this is a better idea. I'm not completely sure that the
first approach really has no side-effects, and in particular whether it
makes it possible to meet the spec's requirements (last paragraph of
§4.8). Restarting the algorithm with relaxed constraints is more in the
spirit of the Knuth algorithm.


Vincent