Posted to users@pdfbox.apache.org by Lisa Moore <le...@jhmi.edu.INVALID> on 2024/01/10 12:39:36 UTC
FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
From: Lisa Moore
Sent: Tuesday, January 9, 2024 10:54 AM
To: users-help@pdfbox.apache.org
Subject: PDFBox 3.0.1 Font changes when rendering PDF to Image
Hi,
I am using PDFBox to render a PDF to a .png image. In the past, I used version 2.0.23, which worked without issue. When the image is rendered in version 3.0.1, the text in the PDF document is not rendered in the correct font (Times Roman). How can I fix this issue? I have attached images comparing what is rendered in version 3.0.1 versus 2.0.23.
Thanks for any help you can provide.
Lisa Moore
Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
That's why I suggested looking at the log messages: there would be one
mentioning that a fallback font is used.
The alternative would be to implement your own FontMapper. Call
FontMappers.set() with your own FontMapper. To see how to implement your
own, look at the source code of FontMapperImpl class.
All this is not trivial, probably several days of work. The best would
be to expand the "lastResortFont" part to support all standard 14 fonts
instead of just having LiberationSans.
Tilman
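One piece of the suggestion above, choosing a substitute for each of the standard 14 fonts instead of always falling back to LiberationSans, can be sketched as a plain lookup table. This is only an illustration under stated assumptions: the class name Standard14Substitutes and the bundled file names are hypothetical, it assumes the Liberation font files are shipped inside the application jar, and it deliberately leaves out the PDFBox wiring (a custom FontMapper registered through FontMappers.set() would consult a table like this).

```java
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical lookup table from standard 14 base font names to bundled
 * substitute font files. The file names are assumptions for illustration;
 * nothing here is part of the PDFBox API.
 */
final class Standard14Substitutes {

    private static final Map<String, String> SUBSTITUTES = Map.ofEntries(
            Map.entry("Times-Roman", "LiberationSerif-Regular.ttf"),
            Map.entry("Times-Bold", "LiberationSerif-Bold.ttf"),
            Map.entry("Times-Italic", "LiberationSerif-Italic.ttf"),
            Map.entry("Times-BoldItalic", "LiberationSerif-BoldItalic.ttf"),
            Map.entry("Helvetica", "LiberationSans-Regular.ttf"),
            Map.entry("Helvetica-Bold", "LiberationSans-Bold.ttf"),
            Map.entry("Helvetica-Oblique", "LiberationSans-Italic.ttf"),
            Map.entry("Helvetica-BoldOblique", "LiberationSans-BoldItalic.ttf"),
            Map.entry("Courier", "LiberationMono-Regular.ttf"),
            Map.entry("Courier-Bold", "LiberationMono-Bold.ttf"),
            Map.entry("Courier-Oblique", "LiberationMono-Italic.ttf"),
            Map.entry("Courier-BoldOblique", "LiberationMono-BoldItalic.ttf"));

    private Standard14Substitutes() {
    }

    /**
     * Returns the bundled file to use for the given base font name, or an
     * empty Optional when the table has no entry (for example Symbol and
     * ZapfDingbats), so the caller can fall back to its default behaviour.
     */
    static Optional<String> lookup(String baseFont) {
        if (baseFont == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(SUBSTITUTES.get(baseFont));
    }

    public static void main(String[] args) {
        System.out.println(lookup("Times-Roman").orElse("no substitute"));
        System.out.println(lookup("Symbol").orElse("no substitute"));
    }
}
```

Returning an empty Optional for unknown names keeps the decision about the last-resort font with the caller rather than hiding it in the table.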
RE: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
Posted by Lisa Moore <le...@jhmi.edu.INVALID>.
I think the issue is that the required font is not on the Azure Kubernetes image that we are now running on. We are not allowed to load any fonts on this image. Is there a way to embed the required font into the Java code that is creating the image from the PDF file? The Java code is included below:
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.imageio.ImageIO;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

public class PDFToImage {

    // Decodes a Base64-encoded PDF, renders every page at 150 DPI and
    // returns all pages stacked vertically as PNG bytes.
    public static Object transformMessage(String baos) throws Exception {
        byte[] decodedString = Base64.getDecoder().decode(baos.getBytes(StandardCharsets.UTF_8));
        // try-with-resources closes the document, so no explicit close() is needed
        try (PDDocument pddDoc = Loader.loadPDF(decodedString)) {
            PDFRenderer pr = new PDFRenderer(pddDoc);
            int pageCount = pddDoc.getNumberOfPages();
            // Placeholder image that the rendered pages are appended to
            BufferedImage bim = new BufferedImage(25, 25, BufferedImage.TYPE_INT_ARGB);
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            for (int page = 0; page < pageCount; page++) {
                BufferedImage bimage = pr.renderImageWithDPI(page, 150, ImageType.RGB);
                bim = joinBufferedImage(bim, bimage);
            }
            ImageIO.write(bim, "png", stream);
            return stream.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
            throw new Exception(e);
        }
    }

    // Draws img2 below img1 on a white background, separated by a small gap.
    private static BufferedImage joinBufferedImage(BufferedImage img1, BufferedImage img2) {
        int offset = 5;
        int wid = Math.max(img1.getWidth(), img2.getWidth() + offset);
        int height = img1.getHeight() + img2.getHeight() + offset;
        BufferedImage newImage = new BufferedImage(wid, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = newImage.createGraphics();
        g2.setPaint(Color.WHITE);
        g2.fillRect(0, 0, wid, height);
        g2.drawImage(img1, null, 0, 0);
        g2.drawImage(img2, null, 0, img1.getHeight() + offset);
        g2.dispose();
        return newImage;
    }
}
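A side note on the page-joining step in the code above, unrelated to the font problem: each call to joinBufferedImage allocates a new, larger image, and the 25x25 placeholder ends up as an empty strip at the top of the result. A minimal sketch of an alternative that stacks all rendered pages in one pass follows. The class PageStacker and its stack method are hypothetical names, and the sketch uses only the JDK, so the rendered pages would have to be collected into a list first.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

/** Hypothetical helper that stacks images vertically on a white background. */
final class PageStacker {

    private PageStacker() {
    }

    /**
     * Stacks the given images top to bottom, left-aligned, with 'gap' pixels
     * of white space between consecutive images.
     */
    static BufferedImage stack(List<BufferedImage> pages, int gap) {
        if (pages.isEmpty()) {
            throw new IllegalArgumentException("no pages to stack");
        }
        // First pass: compute the size of the combined image.
        int width = 0;
        int height = gap * (pages.size() - 1);
        for (BufferedImage page : pages) {
            width = Math.max(width, page.getWidth());
            height += page.getHeight();
        }
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            // White background, then draw each page below the previous one.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            int y = 0;
            for (BufferedImage page : pages) {
                g.drawImage(page, 0, y, null);
                y += page.getHeight() + gap;
            }
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

With this helper the render loop would only add each page image to a list, and the combined image would be built once after the loop.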
Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
I tested with 3.0.1 and got one log message:
Unexpected XRefTable Entry: 0 24
that's because that line is " 0 24" instead of "0 24". However that
doesn't seem to have a negative effect. Here's how the image looks:
Tilman
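To illustrate the point about " 0 24" versus "0 24": a cross-reference subsection header consists of the first object number and the entry count separated by a single space. The sketch below is not PDFBox's parser, and the class XrefHeaderCheck is a hypothetical name. It only shows how a strict check would flag the leading space while a lenient parse still recovers both numbers, which matches the observation that the message appears to be harmless.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration only; this is not how PDFBox parses xref tables. */
final class XrefHeaderCheck {

    // Strict form: "<first object number> <entry count>" with no extra whitespace.
    private static final Pattern STRICT = Pattern.compile("\\d+ \\d+");

    private XrefHeaderCheck() {
    }

    /** True only when the line has exactly the strict form. */
    static boolean isStrictHeader(String line) {
        return STRICT.matcher(line).matches();
    }

    /** Tolerates leading and trailing whitespace and returns {firstObject, count}. */
    static int[] parseLenient(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("not an xref subsection header: " + line);
        }
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    public static void main(String[] args) {
        System.out.println(isStrictHeader("0 24"));
        System.out.println(isStrictHeader(" 0 24"));
        int[] header = parseLenient(" 0 24");
        System.out.println(header[0] + " " + header[1]);
    }
}
```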
RE: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
Posted by Lisa Moore <le...@jhmi.edu.INVALID>.
A sample PDF file can be seen here:
https://www.dropbox.com/scl/fi/w5zgfrqbulungxd4dpq37/MuseTest.pdf?rlkey=jskisldanhoxf3pvcqqy6nk7b&dl=0
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image
Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
We'd need the PDF file; please upload it to a file-sharing service. Your
attachments (all of them) didn't get through.
Also try to use the latest snapshot
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.2-SNAPSHOT/
and look at the log messages.
Tilman
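For anyone unsure how to make the log messages visible, here is a minimal sketch that raises the log level for the PDFBox packages. It rests on an assumption: that the logging facade used by PDFBox ends up delegating to java.util.logging, which is typically the case when no other logging backend is on the classpath. The class name PdfBoxLogSetup is hypothetical; the logger names are simply the package names org.apache.pdfbox and org.apache.fontbox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; assumes log output is routed through java.util.logging. */
final class PdfBoxLogSetup {

    // java.util.logging only holds loggers weakly, so keep strong references
    // to the configured loggers to stop the settings from being lost.
    private static final List<Logger> CONFIGURED = new ArrayList<>();

    private PdfBoxLogSetup() {
    }

    /** Sends all messages of the named logger, at every level, to the console. */
    static synchronized Logger enableVerboseLogging(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.ALL);
        if (logger.getHandlers().length == 0) {
            ConsoleHandler handler = new ConsoleHandler();
            handler.setLevel(Level.ALL);
            logger.addHandler(handler);
            // Avoid printing each message twice through the root logger's handler.
            logger.setUseParentHandlers(false);
        }
        CONFIGURED.add(logger);
        return logger;
    }

    public static void main(String[] args) {
        enableVerboseLogging("org.apache.pdfbox");
        enableVerboseLogging("org.apache.fontbox");
        // Render the PDF after this point and watch the console output.
    }
}
```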