Posted to users@pdfbox.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2017/03/01 22:55:05 UTC
Problem with unsupported characters in a font
All,
I'm getting an error when preparing text to write to a PDF document:
java.lang.IllegalArgumentException: U+2265 ('greaterequal') is not available in this font's encoding: WinAnsiEncoding
    at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:345)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:286)
    at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:315)
It's obvious that the ≥ symbol isn't available in the font we are using
(probably the default set of fonts... we aren't doing anything fancy at
this point).
Is there a good way to "sanitize" a string for the current font?
I can just start building a character-by-character replacement table,
but that's a little too whack-a-mole for my tastes. I'd prefer to do
something like ask the API what characters aren't okay, replace them
with something that IS okay (like "?") and log a warning. Then we can
collect the warnings and map the characters in a nicer way later.
Is there any way to do that kind of thing with PDFBox?
Thanks,
-chris
Re: Problem with unsupported characters in a font
Posted by Tilman Hausherr <TH...@t-online.de>.
On 01.03.2017 at 23:55, Christopher Schultz wrote:
> All,
>
> I'm getting an error when preparing text to write to a PDF document:
>
> java.lang.IllegalArgumentException: U+2265 ('greaterequal') is not available in this font's encoding: WinAnsiEncoding
>     at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:345)
>     at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:286)
>     at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:315)
>
>
> It's obvious that the ≥ symbol isn't available in the font we are using
> (probably the default set of fonts... we aren't doing anything fancy at
> this point).
>
> Is there a good way to "sanitize" a string for the current font?
You could call PDFont.encode() for each character, catch the
IllegalArgumentException, and replace the offending character with
whatever you like.
Btw, the symbol may be available in the font itself; the limit you are
hitting is WinAnsiEncoding. If you load font files (call
PDType0Font.load()), then you can use many more glyphs.
Tilman
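The per-character approach suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: FontSanitizer, Encoder and sanitize are hypothetical names, and Encoder is a stand-in for the font's encode call so that the sketch compiles without PDFBox on the classpath. It walks the string by code point, keeps what the encoder accepts, substitutes a replacement for anything that raises IllegalArgumentException, and records the rejected code points so they can be logged and mapped properly later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

final class FontSanitizer {

    /** Hypothetical stand-in for the font's encode call; this is not a PDFBox type. */
    interface Encoder {
        void encode(String text) throws IOException;
    }

    /**
     * Returns a copy of text in which every code point the encoder rejects
     * is replaced by 'replacement'. Rejected code points are appended to
     * 'rejected' so the caller can log them.
     */
    static String sanitize(String text, Encoder encoder, char replacement,
                           List<Integer> rejected) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Work per code point so surrogate pairs are never split.
            int codePoint = text.codePointAt(i);
            String single = new String(Character.toChars(codePoint));
            try {
                encoder.encode(single);
                out.append(single);
            } catch (IllegalArgumentException e) {
                // The exception type reported in the stack trace above.
                rejected.add(codePoint);
                out.append(replacement);
            } catch (IOException e) {
                // Not an encoding problem; surface it unchanged in meaning.
                throw new UncheckedIOException(e);
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

Assuming PDFBox 2.0.x, where PDFont.encode(String) is public and declares IOException, the call might look like FontSanitizer.sanitize(text, font::encode, '?', rejected), with each entry in rejected logged as a warning. The replacement character itself must of course be encodable in the font.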