Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@jeremias-maerki.ch> on 2006/03/01 11:21:36 UTC

letter-spacing

Still trying to fix my problem with letter-spacing and fixed width
spaces. Do I understand correctly that XSL-FO's view of
letter-spacing is different from, say, PDF's? PDF's character spacing
(PDF 1.4, 5.2.1) is designed so it advances the cursor for each (!)
character by the Tc value. FO, on the other hand, applies half the
letter-spacing value on the start and end side of the glyph, and only
for the characters that are classified as "Alphabetic" by Unicode. And I
haven't even said anything about setting precedence and conditionality to
anything other than the default.
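
To make the difference concrete, here is a minimal Python sketch of the
two models as described above. It assumes one glyph per character and
ignores any space resolution at line edges. The function names are made
up for illustration, and str.isalpha() only approximates Unicode's
"Alphabetic" property.

```python
def pdf_extra_advance(text: str, tc: float) -> float:
    """PDF model: Tc is added after every glyph shown, spaces included."""
    return tc * len(text)


def fo_extra_advance(text: str, letter_spacing: float) -> float:
    """FO model as described above: half the value on the start side and
    half on the end side of each alphabetic character, i.e. one full
    value per alphabetic character. Non-alphabetic characters get nothing.
    """
    # Assumption: str.isalpha() stands in for the Unicode "Alphabetic" class.
    return letter_spacing * sum(1 for ch in text if ch.isalpha())


line = "text text text"
print(pdf_extra_advance(line, 1.0))  # 14 glyphs, spaces included -> 14.0
print(fo_extra_advance(line, 1.0))   # 12 alphabetic characters   -> 12.0
```

With the same nominal value the two models already disagree on the total
width of a line, purely because of the two space characters.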

The weird thing about this definition in FO is that these spaces are
added (by default) in every case, i.e. even at the beginning of a line:

("|"=line boundaries, "_"=spaces generated by letter-spacing)
(The text "text text text" is used with text-align-last="justify" here)

letter-spacing="normal":

|text              text              text|

(Note: FOP does have the "permission" by the FO spec to increase the
inter-character gap here but we don't right now.)

letter-spacing="1pt":

|_t__e__x__t_  _t__e__x__t_  _t__e__x__t_|


PDF's character spacing would work like this, I think (although the last
character space needs to be eliminated by the layout manager [1]):

|t__e__x__t__  __t__e__x__t__  t__e__x__t|(__) <-- [1]


If I'm right here (not really sure, that's why I'm asking), it would
mean that we should probably stop using the Tc feature from PDF and
instead control the glyph positioning ourselves like we already do in
PostScript.

WDYT?

Jeremias Maerki


Re: letter-spacing

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
I think I've just found another bug in the spec. The "Alphabetic"
classification does not include characters like "-" or "/" (our current break
characters). So, assume a text that includes a string like "XSL-FO" and
allow for letter-spacing. Between "S" and "L", you get two half
letter-spaces (together 1 letter-space). Between "L" and "-", you get
only one half letter-space if you strictly follow the spec. Not quite
what I would expect and not what other layouters do (two commercial FO
implementations I checked, OpenOffice 2.0 and Word 2003). Grmbl.
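
The "XSL-FO" case can be checked with a small Python sketch. It assumes
the strict reading described above (each alphabetic character contributes
half a letter-space on each side, everything else contributes nothing).
The helper name is hypothetical and str.isalpha() is only an
approximation of the Unicode "Alphabetic" property.

```python
def gaps(text, letter_spacing):
    """Return (pair, gap) for every pair of adjacent characters."""
    half = letter_spacing / 2
    result = []
    for left, right in zip(text, text[1:]):
        gap = 0.0
        if left.isalpha():    # end-side half space of the left character
            gap += half
        if right.isalpha():   # start-side half space of the right character
            gap += half
        result.append((left + right, gap))
    return result


for pair, gap in gaps("XSL-FO", 1.0):
    print(pair, gap)
# XS 1.0
# SL 1.0
# L- 0.5
# -F 0.5
# FO 1.0
```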

On 01.03.2006 11:21:36 Jeremias Maerki wrote:
<snip/>
> FO, on the other hand, applies half the
> letter-spacing value on the start and end side of the glyph, and only
> for the characters that are classified as "Alphabetic" by Unicode.
<snip/>

Jeremias Maerki


Re: letter-spacing

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 01.03.2006 16:44:39 Luca Furini wrote:
> Jeremias Maerki wrote:
> 
> > > The recommendation states that "The algorithm for resolving the adjusted 
> > > values between word spacing and letter spacing is User Agent dependent." 
> > > (7.17.2 in the candidate recommendation), so I think this is not a wrong 
> > > behaviour: it just assumes that word spaces have a higher precedence than 
> > > letter spaces.
> > 
> > No, actually in both cases the precedence is "force" so all spaces
> > survive the resolution process.
> 
> So, just to check I understood:
> 
> - according to the pdf specifications between two words there is
>    1 word space + 2 letter spaces

1 ls + 1 ws + 1 ls, yes.

> - according to the xsl recommendation there is
>    1 word space + 1 letter space (or better, two half letter spaces)

yes, more or less. Even the word space is separated into two halves.

> - fop currently puts just a word space

yes

> Is this correct?
> 
> But I still don't understand what the words concerning "adjusted values 
> between word spacing and letter spacing" are supposed to mean ...

I've been wondering about that, too. The user agent has some freedom
in choosing default letter and word spacing. The letter- and
word-spacing properties specify spaces in addition to the default spaces.
Maybe that sentence only applies to the default spacing. If you just take
the space traits generated by the spacing properties, then it's clear that
the normal space-resolution rules apply. Hmm.
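
The three inter-word cases confirmed above can be written down as a
small Python sketch. Here word_space stands for the whole word space
(space glyph plus any word-spacing) and ls for one letter-space; the
function names are made up for illustration.

```python
def gap_pdf(word_space, ls):
    # Tc after the last letter of the word, then the space, then Tc
    # after the space glyph itself: 1 ls + 1 ws + 1 ls.
    return ls + word_space + ls


def gap_fo(word_space, ls):
    # Half a letter-space from the letter before the space and half from
    # the letter after it: together 1 ws + 1 ls.
    return ls / 2 + word_space + ls / 2


def gap_fop_current(word_space, ls):
    # Current behaviour as described above: just the word space.
    return word_space


for name, fn in [("pdf", gap_pdf), ("fo", gap_fo), ("fop", gap_fop_current)]:
    print(name, fn(3.0, 1.0))
# pdf 5.0
# fo 4.0
# fop 3.0
```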

> > However, while I was out for a few hours I was thinking about this and I 
> > came to the conclusion that it may make sense to keep an array of 
> > character offsets as an attribute of a WordArea in the area tree.
> 
> It would probably be the best way to deal with kerning too.

That's one of my next topics. It's actually another reason why I thought
about this. Just forgot to list it.

> My only concern is about the resulting pdf size: if we specify an offset 
> for each character, wouldn't it become (at least) twice as big as before?

I don't think it gets twice as big, but yes, we could not use the more
space-efficient commands anymore in that case. But it's only plain text
which is easily compressed with the Flate algorithm.

Jeremias Maerki


Re: letter-spacing

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 01.03.2006 15:30:09 Luca Furini wrote:
> Jeremias Maerki wrote:
> 
> > Still trying to fix my problem with letter-spacing and fixed width
> > spaces. Do I understand correctly that XSL-FO's view of
> > letter-spacing is different from, say, PDF's? PDF's character spacing 
> > (PDF 1.4, 5.2.1) is designed so it advances the cursor for each (!)
> > character by the Tc value.
> 
> Yes, I remember that when I was working on letter spacing it took me a 
> while to understand what was wrong with the resulting pdf! :-)
> 
> > letter-spacing="1pt":
> > 
> > |_t__e__x__t_  _t__e__x__t_  _t__e__x__t_|

Hey, I'm making a complete fool of myself again. Mr. Space-Resolution
doesn't get the simplest of rules right. Of course, the first and the
last spaces are removed due to conditionality="discard" (they start/end a
reference area). So it must actually be:

|t__e__x__t_   _t__e__x__t_   _t__e__x__t|

Grrrrrrr.
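
As a cross-check, a minimal Python sketch that draws this kind of diagram:
every alphabetic character gets a "_" on its start and end side, except
at the very start and end of the line, where the half space is discarded.
It assumes the input is exactly one line, uses a single space between
words (no justification), and approximates "Alphabetic" with
str.isalpha(). The function name is hypothetical.

```python
def render(line):
    """Draw half letter-spaces as "_" and line boundaries as "|"."""
    cells = []
    last = len(line) - 1
    for i, ch in enumerate(line):
        if not ch.isalpha():
            cells.append(ch)  # spaces etc. get no letter-space
            continue
        start = "" if i == 0 else "_"    # discarded at line start
        end = "" if i == last else "_"   # discarded at line end
        cells.append(start + ch + end)
    return "|" + "".join(cells) + "|"


print(render("text text text"))
# |t__e__x__t_ _t__e__x__t_ _t__e__x__t|
```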

> At the moment, fop has
> 
>    |t__e__x__t  t__e__x__t  t__e__x__t|
> 
> in other words there are letter spaces only between letters, and not 
> between a letter and a space.

Yes, so even with my example corrected there is still a difference from
what the spec says.

> The recommendation states that "The algorithm for resolving the adjusted 
> values between word spacing and letter spacing is User Agent dependent." 
> (7.17.2 in the candidate recommendation), so I think this is not a wrong 
> behaviour: it just assumes that word spaces have a higher precedence than 
> letter spaces.

No, actually in both cases the precedence is "force" so all spaces
survive the resolution process.

> Another little difference: each letter space depends on the preceding 
> letter size, instead of depending on both the preceding and following 
> letters sizes; but this has some visible effect only when a word is 
> composed of letters having different sizes.

Right.

> > PDF's character spacing would work like this, I think (although the last
> > character space needs to be eliminated by the layout manager [1]):
> > 
> > |t__e__x__t__  __t__e__x__t__  t__e__x__t|(__) <-- [1]
> 
> This is why the word spacing adjustment stored in the textAreas is not the 
> computed one, but is specifically modified in order to counterbalance the 
> 2 letter spaces that the pdf will add.
> 
> > If I'm right here (not really sure, that's why I'm asking), it would
> > mean that we should probably stop using the Tc feature from PDF and
> > instead control the glyph positioning ourselves like we already do in
> > PostScript.
> > 
> > WDYT?
> 
> As long as we had just two character categories (letters / spaces) the two 
> pdf operators were enough.
> 
> Now, with fixed width spaces too, which should be unaffected by both 
> word spacing (being different from spaces) and letter spacing 
> (being different from normal letters), two operators are too few.
> 
> I don't think we need to set the horizontal positioning of each character 
> or word, but just fix the placement of a character sequence following a 
> fixed width space, removing the letter spaces wrongly added by the Tc 
> operator, alternating character sequences and horizontal adjustments in 
> the TJ array.
> 
> HTH

It does. Thanks. Means I'm not on the wrong track. However, while I was
out for a few hours I was thinking about this and I came to the
conclusion that it may make sense to keep an array of character offsets
as an attribute of a WordArea in the area tree. Several reasons:
- The layout manager already knows exactly where each character should
go. At the moment we're mapping that knowledge into generic
properties and the renderer has to reproduce the effect. That's a
potential source of errors.
- When at some point we go into the details of letter-spacing and
word-spacing, this will get more important and, most of all, more
complicated.
- The renderer code for text becomes simpler if it can simply use the
relative offsets from the area tree.
This change doesn't have to happen right now, but it may be worth
keeping in mind for later. I think we can still live with a few
simplifications for now.
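
A rough Python sketch of how that could look, purely as an illustration.
The WordArea class below is a hypothetical stand-in and not FOP's actual
class. The sketch assumes Tc and Tw are left at 0, horizontal scaling is
100%, and each character maps directly to one single-byte character code.
It relies on the TJ operator's rule that numbers in the array are given
in thousandths of a text space unit and are subtracted from the current
position, so extra space shows up as a negative number.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordArea:
    """Hypothetical stand-in for an area tree word with per-character data."""
    text: str
    # adjustments[i]: extra space in points to insert after text[i]
    adjustments: List[float]


def escape_pdf(s: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def to_tj(word: WordArea, font_size: float) -> str:
    """Alternate character runs and horizontal adjustments in a TJ array."""
    parts = []
    run = ""
    for ch, adj in zip(word.text, word.adjustments):
        run += ch
        if adj != 0:
            parts.append("(" + escape_pdf(run) + ")")
            # points -> thousandths of text space, sign flipped for TJ
            parts.append(format(-adj * 1000 / font_size, "g"))
            run = ""
    if run:
        parts.append("(" + escape_pdf(run) + ")")
    return "[" + " ".join(parts) + "] TJ"


word = WordArea("text", [1.0, 1.0, 1.0, 0.0])
print(to_tj(word, 10.0))
# [(t) -100 (e) -100 (x) -100 (t)] TJ
```

Characters without an adjustment stay in the same string, so plain text
without letter-spacing still produces a single run.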

Jeremias Maerki