Posted to users@pdfbox.apache.org by Lisa Moore <le...@jhmi.edu.INVALID> on 2024/01/10 12:39:36 UTC

FW: PDFBox 3.0.1 Font changes when rendering PDF to Image


From: Lisa Moore
Sent: Tuesday, January 9, 2024 10:54 AM
To: users-help@pdfbox.apache.org
Subject: PDFBox 3.0.1 Font changes when rendering PDF to Image

Hi,

I am using PDFBox to render a PDF to a .png image. In the past I used version 2.0.23, which worked without issue. When the image is rendered with version 3.0.1, the text in the PDF document is not drawn in the correct font (Times Roman). How can I fix this issue? I have attached images comparing what is rendered by version 3.0.1 versus 2.0.23.

Thanks for any help you can provide.

Lisa Moore

Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

That's why I suggested looking at the log messages: there would be one 
mentioning that a fallback font is used.
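
One way to surface such messages programmatically, for example inside a container where console output is hard to reach, is a small log handler. The following is a minimal sketch, not part of PDFBox. It assumes that PDFBox's logging ends up in java.util.logging (which depends on the logging setup on the classpath) and that the relevant message contains the word "fallback"; the class name FallbackFontLogCollector is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Collects log records whose message mentions a fallback font. */
final class FallbackFontLogCollector extends Handler {

    private final List<String> messages = new ArrayList<>();
    // Strong reference so the logger (and this handler) is not garbage collected.
    private Logger attachedTo;

    @Override
    public synchronized void publish(LogRecord record) {
        String msg = record.getMessage();
        // Assumption: the warning of interest contains the word "fallback".
        if (msg != null && msg.toLowerCase(Locale.ROOT).contains("fallback")) {
            messages.add(record.getLoggerName() + ": " + msg);
        }
    }

    @Override
    public void flush() {
        // nothing is buffered
    }

    @Override
    public void close() {
        // no resources to release
    }

    /** Returns a copy of the messages collected so far. */
    synchronized List<String> getMessages() {
        return new ArrayList<>(messages);
    }

    /** Attaches a new collector to the given logger namespace and returns it. */
    static FallbackFontLogCollector install(String loggerName) {
        FallbackFontLogCollector collector = new FallbackFontLogCollector();
        collector.setLevel(Level.ALL);
        collector.attachedTo = Logger.getLogger(loggerName);
        collector.attachedTo.addHandler(collector);
        return collector;
    }
}
```

Installing the collector on "org.apache.pdfbox" before rendering and reading getMessages() afterwards would then show whether a substitute font was used, under the assumptions stated above.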

The alternative would be to implement your own FontMapper. Call 
FontMappers.set() with your own FontMapper. To see how to implement your 
own, look at the source code of FontMapperImpl class.

None of this is trivial; it is probably several days of work. The best 
approach would be to expand the "lastResortFont" part to support all 
Standard 14 fonts instead of just LiberationSans.
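
As a rough illustration of what covering the Standard 14 fonts could involve, the lookup below maps each Standard 14 base font name to a Liberation font file with a similar design. It is a sketch using only the JDK, not PDFBox code. The class name Standard14Substitutes is hypothetical, and the assumption is that the listed Liberation .ttf files are bundled with the application (for example as classpath resources). Symbol and ZapfDingbats have no Liberation counterpart, so they are left unmapped.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Maps Standard 14 base font names to bundled substitute font files. */
final class Standard14Substitutes {

    private static final Map<String, String> SUBSTITUTES;

    static {
        Map<String, String> m = new LinkedHashMap<>();
        // Serif family
        m.put("Times-Roman", "LiberationSerif-Regular.ttf");
        m.put("Times-Bold", "LiberationSerif-Bold.ttf");
        m.put("Times-Italic", "LiberationSerif-Italic.ttf");
        m.put("Times-BoldItalic", "LiberationSerif-BoldItalic.ttf");
        // Sans-serif family
        m.put("Helvetica", "LiberationSans-Regular.ttf");
        m.put("Helvetica-Bold", "LiberationSans-Bold.ttf");
        m.put("Helvetica-Oblique", "LiberationSans-Italic.ttf");
        m.put("Helvetica-BoldOblique", "LiberationSans-BoldItalic.ttf");
        // Monospaced family
        m.put("Courier", "LiberationMono-Regular.ttf");
        m.put("Courier-Bold", "LiberationMono-Bold.ttf");
        m.put("Courier-Oblique", "LiberationMono-Italic.ttf");
        m.put("Courier-BoldOblique", "LiberationMono-BoldItalic.ttf");
        SUBSTITUTES = Collections.unmodifiableMap(m);
    }

    private Standard14Substitutes() {
    }

    /**
     * Returns the substitute font file name for a Standard 14 base font,
     * or an empty Optional if the name is unknown or has no substitute here.
     */
    static Optional<String> resourceFor(String baseFont) {
        if (baseFont == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(SUBSTITUTES.get(baseFont));
    }
}
```

A custom FontMapper could consult such a table for the requested base font, load the matching file, and hand everything else to the default mapper. How the font is loaded and wrapped is left out here on purpose; the FontMapperImpl source mentioned above is the place to check for that.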

Tilman




RE: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image

Posted by Lisa Moore <le...@jhmi.edu.INVALID>.
I think the issue is that the required font is not on the Azure Kubernetes image that we are now running on. We are not allowed to load any fonts on this image. Is there a way to embed the required font in the Java code that creates the image from the PDF file? The Java code is included below:

import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import javax.imageio.ImageIO;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

public class PDFToImage {

    // Decodes a Base64-encoded PDF, renders every page at 150 DPI and
    // returns all pages stacked vertically as the bytes of a single PNG.
    public static Object transformMessage(String baos) throws Exception {
        byte[] decodedString = Base64.getDecoder().decode(baos.getBytes(StandardCharsets.UTF_8));
        // try-with-resources closes the document, so no explicit close() is needed
        try (PDDocument pddDoc = Loader.loadPDF(decodedString)) {
            PDFRenderer pr = new PDFRenderer(pddDoc);
            int pageCount = pddDoc.getNumberOfPages();
            // 25x25 placeholder that the first page is appended to;
            // it leaves a small blank strip at the top of the result
            BufferedImage bim = new BufferedImage(25, 25, BufferedImage.TYPE_INT_ARGB);
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            for (int page = 0; page < pageCount; page++) {
                BufferedImage bimage = pr.renderImageWithDPI(page, 150, ImageType.RGB);
                bim = joinBufferedImage(bim, bimage);
            }

            ImageIO.write(bim, "png", stream);
            return stream.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
            throw new Exception(e);
        }
    }

    // Draws img2 below img1 on a white canvas, separated by a 5-pixel gap.
    private static BufferedImage joinBufferedImage(BufferedImage img1, BufferedImage img2) {
        int offset = 5;
        int wid = Math.max(img1.getWidth(), img2.getWidth() + offset);
        int height = img1.getHeight() + img2.getHeight() + offset;
        BufferedImage newImage = new BufferedImage(wid, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = newImage.createGraphics();
        Color oldColor = g2.getColor();
        // white background, then restore the previous drawing color
        g2.setPaint(Color.WHITE);
        g2.fillRect(0, 0, wid, height);
        g2.setColor(oldColor);
        g2.drawImage(img1, null, 0, 0);
        g2.drawImage(img2, null, 0, img1.getHeight() + offset);
        g2.dispose();

        return newImage;
    }
}
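
To check the suspicion above that the container image has no usable font files, a directory scan along the following lines could be run inside the container. This is a minimal sketch using only the JDK. The directory list is an assumption about common Linux font locations and is not necessarily the list PDFBox itself searches; the class name FontFileScan and the set of file extensions are likewise illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

/** Lists font files found below a set of directories. */
final class FontFileScan {

    private FontFileScan() {
    }

    /** Assumed typical font locations on a Linux image. */
    static List<Path> defaultRoots() {
        return Arrays.asList(
                Paths.get("/usr/share/fonts"),
                Paths.get("/usr/local/share/fonts"),
                Paths.get(System.getProperty("user.home"), ".fonts"));
    }

    /** Recursively collects files with a font-like extension under each root. */
    static List<Path> findFontFiles(List<Path> roots) {
        List<Path> found = new ArrayList<>();
        for (Path root : roots) {
            if (!Files.isDirectory(root)) {
                continue; // skip locations that do not exist on this image
            }
            try (Stream<Path> walk = Files.walk(root)) {
                walk.filter(Files::isRegularFile)
                    .filter(FontFileScan::looksLikeFont)
                    .sorted()
                    .forEach(found::add);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return found;
    }

    /** True for common font file extensions, ignoring case. */
    static boolean looksLikeFont(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".ttf") || name.endsWith(".otf")
                || name.endsWith(".ttc") || name.endsWith(".pfb");
    }
}
```

Printing FontFileScan.findFontFiles(FontFileScan.defaultRoots()) from inside the container gives a quick picture of which font files are present. An empty list would be consistent with a fallback font being used.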


Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

I tested with 3.0.1 and got one log message:

Unexpected XRefTable Entry: 0    24

That's because the line is "     0 24" instead of "0 24". However, that 
doesn't seem to have a negative effect. Here's how the image looks:


Tilman
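
For context on the log message above: a cross-reference subsection header consists of two numbers separated by a single space, the first object number and the entry count. The sketch below contrasts a strict match with a lenient one that tolerates the leading spaces found in this file. It is an illustration only and not how PDFBox parses the table; XrefSubsectionHeader is a hypothetical name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parsed "first-object-number entry-count" line of an xref subsection. */
final class XrefSubsectionHeader {

    // Strict form: two integers separated by exactly one space, no padding.
    private static final Pattern STRICT = Pattern.compile("(\\d+) (\\d+)");
    // Lenient form: leading/trailing whitespace and wider gaps are accepted.
    private static final Pattern LENIENT = Pattern.compile("\\s*(\\d+)\\s+(\\d+)\\s*");

    final long firstObjectNumber;
    final int entryCount;

    private XrefSubsectionHeader(long firstObjectNumber, int entryCount) {
        this.firstObjectNumber = firstObjectNumber;
        this.entryCount = entryCount;
    }

    /** Returns the parsed header, or an empty Optional if the line does not match. */
    static Optional<XrefSubsectionHeader> parse(String line, boolean lenient) {
        Matcher m = (lenient ? LENIENT : STRICT).matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new XrefSubsectionHeader(
                Long.parseLong(m.group(1)), Integer.parseInt(m.group(2))));
    }
}
```

With this sketch, "0 24" matches in both modes, while "     0 24" only matches in lenient mode and yields first object number 0 with 24 entries.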


RE: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image

Posted by Lisa Moore <le...@jhmi.edu.INVALID>.
A sample PDF file can be seen here:
https://www.dropbox.com/scl/fi/w5zgfrqbulungxd4dpq37/MuseTest.pdf?rlkey=jskisldanhoxf3pvcqqy6nk7b&dl=0



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: FW: PDFBox 3.0.1 Font changes when rendering PDF to Image

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

We'd need the PDF file; please upload it to a file-sharing service. Your 
attachments (all of them) didn't get through.

Also try the latest snapshot
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.2-SNAPSHOT/
and look at the log messages.
Tilman
