Posted to users@pdfbox.apache.org by "Richter, Michael" <m....@tu-berlin.de> on 2019/02/01 09:54:14 UTC
Re: Choosing a font for non-ASCII characters
Hi,
A few weeks ago I had issues with Unicode too. I switched the font to LiberationSans, which is included in PDFBox:
// The third argument (embedSubset = true) embeds only the glyphs actually used.
PDFont font = PDType0Font.load(document,
    PDDocument.class.getResourceAsStream("/org/apache/pdfbox/resources/ttf/LiberationSans-Regular.ttf"),
    true);
This works for me.
I also came across this, which may help you:
https://stackoverflow.com/questions/51481600/handle-many-unicode-caracters-with-pdfbox
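On the question of detecting unsupported characters without scanning inefficiently, one possible approach is to split the text into runs, so that each run can be drawn with a single font. The sketch below is plain Java with no PDFBox dependency; the class name FontRuns and the coverage predicate are hypothetical names made up for illustration. In a real setup the predicate would have to ask the font itself. As far as I know, PDFont.encode(String) in PDFBox 2.0.x throws an IllegalArgumentException when a glyph is missing, so wrapping that call in a try/catch could serve as the check, but treat that as an assumption to verify against your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Splits text into runs so that each run can be drawn with a single font. */
public class FontRuns {

    /** A stretch of text plus whether the primary font can render it. */
    public static final class Run {
        public final String text;
        public final boolean primary;

        Run(String text, boolean primary) {
            this.text = text;
            this.primary = primary;
        }
    }

    /**
     * Walks the text by code point (not by char, so surrogate pairs stay
     * together) and starts a new run whenever coverage by the primary
     * font changes.
     */
    public static List<Run> split(String text, IntPredicate primaryCovers) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean currentPrimary = true;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean covered = primaryCovers.test(cp);
            if (current.length() > 0 && covered != currentPrimary) {
                runs.add(new Run(current.toString(), currentPrimary));
                current.setLength(0);
            }
            currentPrimary = covered;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(new Run(current.toString(), currentPrimary));
        }
        return runs;
    }

    public static void main(String[] args) {
        // Stand-in coverage test (an assumption for the demo): pretend the
        // primary font covers only code points up to U+00FF.
        IntPredicate latin1Only = cp -> cp <= 0xFF;
        for (Run r : split("China: 中國 (x ≤ 5)", latin1Only)) {
            System.out.println((r.primary ? "primary : " : "fallback: ") + r.text);
        }
    }
}
```

With this shape, text that the primary font fully covers comes back as a single run, so the common Latin-only case costs one pass over the string and the fallback font only needs to be loaded when a "fallback" run actually appears.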
--
Michael Richter
On Wednesday, 30.01.2019, 20:56 -0500, Christopher Schultz wrote:
Hello,
We are using PDFBox to generate PDFs in a very simple way, using only
the fonts available from the PDType1Font class (e.g.
PDType1Font.HELVETICA). The PDFs we generate contain only a few
titles/subtitles, text, and bulleted/numbered lists.
Everything is fine when we stick to what is probably the standard Latin
alphabet, but we've had some trouble with special characters that
don't fit in there, such as ≥ and ≤. We've dealt with that by simply
replacing "≤" with "<=" and so on, but we're starting to use languages
that don't use Latin script, so we can no longer replace our way
out of the problem.
For example, I need to be able to put Chinese characters into a PDF we
generate. So let's take the text "中國" which is just the word "China"
in Traditional Chinese script.
First, how can I find out that the character isn't going to fit into
the font that I'm currently using? Should I do it for every character
we try to put into the page, or should we just catch exceptions when
we try to write the text to the page and then scan at that point? I'm
trying to avoid writing hideously inefficient code to handle these
situations.
Second, once I know that I need to choose another font... how do I
know which font to choose? Should I keep a mapping of Unicode code
point ranges and the best fonts to use for them?
Finally, what fonts are actually available to PDFBox? How do I add new
ones? I have a lot of control over the environment and I get to see
failing conversions and intervene, so some trial and error is okay for
each new situation.
The recipients of our PDFs are file-size sensitive, so I'd only want
to include (bundle) a font in a PDF if it was absolutely necessary to
include the font itself. If we can get away with including a
*reference* to the font in the PDF and telling these recipients
"sorry, if you want to read the Chinese PDFs we send, you'd better
make sure you have font X installed" then that's okay with me, too.
What suggestions do people have for doing all of the above?
Thanks,
-chris