Posted to fop-dev@xmlgraphics.apache.org by Arnd Beißner <ar...@cappelino.de> on 2003/01/22 23:55:14 UTC

BUG: Mapping of ascii minus character in PS renderer

Hello there,

after some research I found and fixed a bug in the PS renderer
that can be a real nuisance.

The problem is as follows: The ascii (and Unicode) minus
character is mapped to the hyphen character by the PDF
renderer. The PostScript renderer instead maps it to the
minus character. This happens because the generated
PS code reencodes the fonts to ISO Latin 1 encoding, which
handles ascii code 45 differently from the standard PS font
encoding.

Typographically, the character at 45 in ISOLatin1 is a real minus,
and the character at 45 in Standard Encoding is a hyphen, which
is about half as wide as the minus in your average font. The
difference in your PS output can be quite destructive, as FOP
always formats assuming the width of the hyphen character...

A "patch" follows. The reason I'm not yet submitting a real diff
to Bugzilla is that I am a) extremely overloaded right now and
b) this really needs to be discussed:

Some thoughts on this
(by 'FOP' I mean formatter+PDF renderer code):

1. Who's right and who's wrong?
Either FOP  - or - the PS renderer is right, but who?

2. If FOP is right, then the PS renderer must be
fixed. This can be done either by fixing the method
renderWordArea or by changing the PS procedures.
However, the latter would increase PS file size
(can't copy the ISO latin 1 encoding as opposed
to the standard encoding), so I opted for changing
renderWordArea.

3. If FOP is wrong, then probably someone else
must fix it - I suppose I won't find the right place
for the fix easily.

Personally I think the PS renderer is wrong, since
the original Adobe PS character encoding maps
ascii 45 to the hyphen character and Adobe usually
knows what they're doing. Still, at that point in time,
Unicode wasn't there yet, so...

This is an issue that we may possibly want
to solve before 0.20.5 goes final. Personally, I won't
have time before the weekend to check with
the Unicode and/or XSL spec.

Any comments/ideas?

--------------- temp fix that I use ---------------------------
PSRenderer, method renderWordArea:
        for (int i = 0; i < l; i++) {
            char ch = s.charAt(i);
            char mch = fs.mapChar(ch);

            // temp fix abe: map ascii '-' to ISO latin 1 hyphen char
            if (mch == '-') {
              sb = sb.append("\\" + Integer.toOctalString(173));
            } else /* fix ends */ if (mch > 127) {
                sb = sb.append("\\" + Integer.toOctalString(mch));
            } else {
                String escape = "\\()[]{}";
                if (escape.indexOf(mch) >= 0) {
                    sb.append("\\");
                }
                sb = sb.append(mch);
            }
        }
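
[A self-contained sketch of the loop above, for trying the escaping
outside of FOP. The class name PsStringEscaper and the method escape
are hypothetical and not part of FOP, and the fs.mapChar step is left
out: the input is assumed to be already mapped to single-byte codes.
Only the temp fix and the escaping logic from the snippet above are
reproduced.]

```java
class PsStringEscaper {
    // Code of the hyphen glyph in ISOLatin1Encoding (octal 255), as used in the temp fix.
    private static final int ISO_LATIN1_HYPHEN = 173;

    // Characters that the original snippet prefixes with a backslash.
    private static final String ESCAPE = "\\()[]{}";

    // Hypothetical helper: escapes already-mapped characters for a PS string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char mch = s.charAt(i);
            if (mch == '-') {
                // temp fix: emit the ISOLatin1 hyphen instead of code 45 (minus)
                sb.append('\\').append(Integer.toOctalString(ISO_LATIN1_HYPHEN));
            } else if (mch > 127) {
                // non-ASCII codes are written as octal escapes
                sb.append('\\').append(Integer.toOctalString(mch));
            } else {
                if (ESCAPE.indexOf(mch) >= 0) {
                    sb.append('\\');
                }
                sb.append(mch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints a\255b \(c\)
        System.out.println(escape("a-b (c)"));
    }
}
```

[With this sketch, every ASCII '-' ends up as \255 in the PS string, so
the reencoded font shows the hyphen glyph whose width the formatter
assumed.]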
--
Cappelino Informationstechnologie GmbH
Arnd Beißner


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: BUG: Mapping of ascii minus character in PS renderer

Posted by Christian Geisert <ch...@isu-gmbh.de>.
Jeremias Maerki wrote:

[..]

> I'll put your fix in but I can't guarantee that it'll be before
> Christian does the release.

Bug #15936 is still an open issue ...

I've mixed feelings about committing patches at this stage of the
release but it's ok if they are as simple as this one.
(I'm just thinking about committing the Named Destination patch..)

Christian




Re: BUG: Mapping of ascii minus character in PS renderer

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 22.01.2003 23:55:14 Arnd Beißner wrote:
> Hello there,
> 
> after some research I found and fixed a bug in the PS renderer
> that can be a real nuisance.

Yeah, one that I never got round to fixing.

> The problem is as follows: The ascii (and Unicode) minus
> character is mapped to the hyphen character by the PDF
> renderer. The PostScript renderer instead maps it to the
> minus character. This happens because the generated
> PS code reencodes the fonts to ISO Latin 1 encoding, which
> handles ascii code 45 differently from the standard PS font
> encoding.
> 
> Typographically, the character at 45 in ISOLatin1 is a real minus,
> and the character at 45 in Standard Encoding is a hyphen, which
> is about half as wide as the minus in your average font. The
> difference in your PS output can be quite destructive, as FOP
> always formats assuming the width of the hyphen character...
> 
> A "patch" follows. The reason I'm not yet submitting a real diff
> to Bugzilla is that I am a) extremely overloaded right now and
> b) this really needs to be discussed:
> 
> Some thoughts on this
> (by 'FOP' I mean formatter+PDF renderer code):
> 
> 1. Who's right and who's wrong?
> Either FOP  - or - the PS renderer is right, but who?

I'm sure that the PS renderer is wrong. When I wrote it I used
ISOLatin1 encoding because it got more characters right than with
StandardEncoding. :-) I didn't want to spend too much time on this
because at that time the PS renderer was merely a proof-of-concept.

> 2. If FOP is right, then the PS renderer must be
> fixed. This can be done either by fixing the method
> renderWordArea or by changing the PS procedures.
> However, the latter would increase PS file size
> (can't copy the ISO latin 1 encoding as opposed
> to the standard encoding), so I opted for changing
> renderWordArea.

I'm not happy with that in the long run, but as an immediate fix it's OK.
When I rewrite the PS renderer for the redesign I intend to get that
right from the beginning. The problem is not just the hyphen character.
There are others. The problem is that the base14 fonts are set to
WinAnsiEncoding (see org.apache.fop.render.pdf.fonts.Helvetica) and the
PS renderer uses ISOLatin1. So, depending on the characters used, you get
multiple mismatches, not just the hyphen character. What we probably need
is a custom encoding scheme like Acrobat Reader uses when converting PDF
to PostScript (PDFEncoding). That'll be some work...
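
[A rough sketch of how such mismatches could be listed by comparing two
encoding tables code by code. The class EncodingMismatch and its
methods are hypothetical and not part of FOP, and the tables are tiny
hand-written excerpts covering only the codes discussed in this thread
(45 and 173) plus one unproblematic code, not the full encoding
vectors.]

```java
import java.util.Map;
import java.util.TreeMap;

class EncodingMismatch {
    // Excerpt only: glyph names at a few codes in WinAnsiEncoding.
    static Map<Integer, String> winAnsiExcerpt() {
        Map<Integer, String> m = new TreeMap<>();
        m.put(45, "hyphen");
        m.put(65, "A");
        m.put(173, "hyphen");
        return m;
    }

    // Excerpt only: glyph names at the same codes in ISOLatin1Encoding.
    static Map<Integer, String> isoLatin1Excerpt() {
        Map<Integer, String> m = new TreeMap<>();
        m.put(45, "minus");
        m.put(65, "A");
        m.put(173, "hyphen");
        return m;
    }

    // Returns the codes present in both tables whose glyph names differ.
    static Map<Integer, String> mismatches(Map<Integer, String> a, Map<Integer, String> b) {
        Map<Integer, String> out = new TreeMap<>();
        for (Map.Entry<Integer, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                out.put(e.getKey(), e.getValue() + " vs " + other);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints {45=hyphen vs minus}
        System.out.println(mismatches(winAnsiExcerpt(), isoLatin1Excerpt()));
    }
}
```

[Fed with the complete encoding vectors instead of these excerpts, the
same comparison would list every code where the font metrics and the PS
output disagree.]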

> 3. If FOP is wrong, then probably someone else
> must fix it - I suppose I won't find the right place
> for the fix easily.

FOP is right.

> Personally I think the PS renderer is wrong, since
> the original Adobe PS character encoding maps
> ascii 45 to the hyphen character and Adobe usually
> knows what they're doing. Still, at that point in time,
> Unicode wasn't there yet, so...
> 
> This is an issue that we may possibly want
> to solve before 0.20.5 goes final. Personally, I won't
> have time before the weekend to check with
> the Unicode and/or XSL spec.
> 
> Any comments/ideas?
> 
> --------------- temp fix that I use ---------------------------
<snip/>

I'll put your fix in but I can't guarantee that it'll be before
Christian does the release.

Jeremias Maerki

