Posted to fop-dev@xmlgraphics.apache.org by Dario Laera <la...@cs.unibo.it> on 2008/11/03 10:30:12 UTC
Re: Choosing a better threshold in line breaking
On 28 Oct 2008, at 13:53, Vincent Hennebert wrote:
> If you could run statistics on more real-life documents (how often is
> the first run without hyphenation sufficient, the third run required,
> justified and left-aligned text, single / two-column on A4 paper,
> etc),
> that would be fantastic.
I've run the examples in the repository with some debug info; you can
find the refined output in the attachment. The interesting output
lines are those with a high "lines" value (to see when long paragraphs
become difficult to break) and those following two consecutive "RETRY".
hyphen.fo was the most interesting case: it clearly shows that even
for a medium paragraph (10 lines) th=1.0 plus hyphenation is not enough.
This is somewhat language dependent: Italian paragraphs don't need an
increased threshold. I think this is because Italian allows more
hyphenation points than other languages such as English, but I don't
think we should care about this issue. I then tried formatting
hyphen.fo using th=5.0 at the second try, and it was always enough
regardless of the alignment. Finally, I formatted the same fo with
hyphenation disabled and the result was mixed: sometimes the third
attempt was necessary, sometimes not.
The franklin*.fo files contain paragraphs longer than those in
hyphen.fo, but with hyphenation disabled, so those paragraphs get
broken at the second attempt even if they are start-aligned.
In inhprop.fo a center-aligned, non-hyphenated paragraph 4 lines long
falls through to forced mode; changing the alignment would make the
third attempt unnecessary.
The results of these tests can be summarized as follows:
* non-hyphenated paragraphs are handled efficiently for both justify
and start alignment as the second attempt is usually sufficient
(steps: 1.0, 5.0, 20.0);
* hyphenated paragraphs would benefit from a th=5.0 attempt, which
isn't performed (steps: 1.0, 1.0 + hyph, 20.0 + hyph);
* medium/long center-aligned paragraphs are likely to need a
threshold higher than 5.0.
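
To make the step sequences above concrete, here is a minimal Python
sketch of the retry loop. It is not FOP's code: the names
(first_successful_attempt, make_stub, the schedule constants) are
hypothetical, the line breaker is replaced by a stub that succeeds once
the threshold reaches an assumed minimum, and the variant schedule is
just one reading of "th=5.0 at the second try".

```python
from typing import Callable, List, Optional, Tuple

# One attempt = (threshold, hyphenation enabled).
Attempt = Tuple[float, bool]

# Step sequences as listed in the summary.
NON_HYPHENATED: List[Attempt] = [(1.0, False), (5.0, False), (20.0, False)]
HYPHENATED: List[Attempt] = [(1.0, False), (1.0, True), (20.0, True)]

# Hypothetical variant: th=5.0 used at the second try (an assumption
# about where the extra step would go, not something FOP does).
HYPHENATED_TH5: List[Attempt] = [(1.0, False), (5.0, True), (20.0, True)]


def first_successful_attempt(
    schedule: List[Attempt],
    try_break: Callable[[float, bool], bool],
) -> Optional[Attempt]:
    """Return the first (threshold, hyphenate) pair for which
    try_break succeeds, or None if every attempt fails."""
    for threshold, hyphenate in schedule:
        if try_break(threshold, hyphenate):
            return (threshold, hyphenate)
    return None


def make_stub(min_plain: float, min_hyph: float) -> Callable[[float, bool], bool]:
    """Stand-in for a real line breaker: the paragraph is assumed to
    break as soon as the threshold reaches min_plain (no hyphenation)
    or min_hyph (with hyphenation)."""
    def try_break(threshold: float, hyphenate: bool) -> bool:
        needed = min_hyph if hyphenate else min_plain
        return threshold >= needed
    return try_break


if __name__ == "__main__":
    # A made-up paragraph needing th >= 3.0 with hyphenation.
    paragraph = make_stub(min_plain=8.0, min_hyph=3.0)
    print(first_successful_attempt(HYPHENATED, paragraph))
    print(first_successful_attempt(HYPHENATED_TH5, paragraph))
```

With the made-up paragraph in the usage part, the current hyphenated
sequence only succeeds at the 20.0 step, while the variant stops at 5.0.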
If you have a typical user XSL-FO file whose behavior is worth
examining, please send it to me.
Dario