Posted to users@pdfbox.apache.org by Ethan Huang <yu...@gmail.com> on 2021/06/03 07:57:32 UTC

Re: Why does JDK 11 produce larger file size when rendering?

Hi Tilman,

I was distracted by other work.
Thanks for sharing the code you experimented with! I ran it on JDK 8, 9, 10,
and 11, and every run produced a file of the same size, 368 KB.
Which JDK versions did you try?

I am just checking whether this is PDFBox related. I would say it is more
likely Java related, but I wonder whether some change in Java affects the
logic in PDFBox.

Here is the code I used to test PDFBox with different JDK versions. For the
file I shared earlier, JDK 8 produces smaller files, even though the
PDFBox-free code you shared above produces the same file size on JDK 8 and
JDK 11.
https://drive.google.com/file/d/1RLAT6doUXZSGH_81z45Bi5Ly-Cvjb87E/view?usp=sharing
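
One way to narrow this down, offered as a minimal sketch and not as a
definitive diagnosis: because PNG is lossless, the two output files can be
decoded again and compared pixel by pixel. If the pixels differ, the
rendering differs between the JDKs; if they are identical, only the PNG
encoding differs. The class name PngCompare and its method names are made up
for illustration and are not part of PDFBox or the JDK.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import javax.imageio.ImageIO;

// Hypothetical helper: compares two decoded images, e.g. one PNG per JDK.
final class PngCompare {

    /** Counts the distinct RGB values in the image (alpha is ignored). */
    static int countDistinctColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y) & 0xFFFFFF);
            }
        }
        return colors.size();
    }

    /** Counts pixels whose RGB value differs; returns -1 if the sizes differ. */
    static long countDifferingPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return -1;
        }
        long differing = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if ((a.getRGB(x, y) & 0xFFFFFF) != (b.getRGB(x, y) & 0xFFFFFF)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PngCompare <first.png> <second.png>");
            return;
        }
        BufferedImage a = ImageIO.read(new File(args[0]));
        BufferedImage b = ImageIO.read(new File(args[1]));
        if (a == null || b == null) {
            System.err.println("could not decode one of the input files");
            return;
        }
        System.out.println("distinct colors, first:  " + countDistinctColors(a));
        System.out.println("distinct colors, second: " + countDistinctColors(b));
        System.out.println("differing pixels:        " + countDifferingPixels(a, b));
    }
}
```

It would be run with the two files rendered from the same page, one per JDK.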

On Fri, Apr 23, 2021 at 8:20 AM Tilman Hausherr <TH...@t-online.de>
wrote:

> It's definitely Java and not PDFBox. I first tested whether the rendering
> hints differ, but they don't. Even without antialiasing, the sizes differ.
> With antialiasing, both the sizes and the color counts differ.
>
> Try this code that contains no PDFBox:
>
>
> import java.awt.Color;
> import java.awt.Font;
> import java.awt.Graphics2D;
> import java.awt.RenderingHints;
> import java.awt.image.BufferedImage;
> import java.io.File;
> import java.io.IOException;
> import java.util.Iterator;
> import javax.imageio.IIOImage;
> import javax.imageio.ImageIO;
> import javax.imageio.ImageWriteParam;
> import javax.imageio.ImageWriter;
> import javax.imageio.stream.ImageOutputStream;
>
> public class RenderTest
> {
>     public static void main(String[] args) throws IOException
>     {
>         // A4 page at 300 dpi
>         int height = 3508;
>         int width = 2480;
>
>         // white page, black text
>         BufferedImage bimg = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
>         Graphics2D g = bimg.createGraphics();
>         g.setColor(Color.WHITE);
>         g.fillRect(0, 0, width, height);
>         g.setColor(Color.BLACK);
>
>         RenderingHints r = new RenderingHints(null);
>         //r.put(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY);
>         r.put(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
>         g.setRenderingHints(r);
>
>         int fontSize = 50;
>         int vertMargin = 200;
>         int leftMargin = 160;
>         int eachOffset = fontSize * 3 / 2;
>
>         Font f = new Font("Courier New", Font.BOLD, fontSize);
>         g.setFont(f);
>
>         // fill the page with numbered lines of text
>         int count = 1;
>         String text = "123456789 123456789 123456789 123456789 123456789 123456789 ";
>         while (vertMargin + (eachOffset * (count - 1)) < height - vertMargin)
>         {
>             String line = String.format("Line %2d: %s", count, text);
>             g.drawChars(line.toCharArray(), 0, line.length(), leftMargin,
>                     vertMargin + eachOffset * (count - 1));
>             ++count;
>         }
>
>         g.dispose();
>
>         // write the image as PNG, one file per Java version
>         Iterator<ImageWriter> imageWriters = ImageIO.getImageWritersByFormatName("png");
>         ImageWriter writer = imageWriters.next();
>         ImageWriteParam param = writer.getDefaultWriteParam();
>         if (param.canWriteCompressed())
>         {
>             param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
>             param.setCompressionQuality(0); // 0 = highest compression
>         }
>         try (ImageOutputStream ios = ImageIO.createImageOutputStream(
>                 new File("test-" + System.getProperty("java.version") + ".png")))
>         {
>             writer.setOutput(ios);
>             writer.write(null, new IIOImage(bimg, null, null), param);
>         }
>         writer.dispose();
>     }
> }
>
>
> Tilman
>
> On 23.04.2021 at 05:55, Tilman Hausherr wrote:
> > Yes, I can confirm this. I tried with two versions of Amazon Corretto,
> > saving as PNG at 100 dpi.
> >
> > I need to do more tests with different PDF types to find out why/when
> > that happens. The two PNG files have a different color count. Because
> > PNG is lossless, the higher color count must already exist before
> > saving.
> >
> > Tilman
> >
> > On 23.04.2021 at 01:24, Ethan Huang wrote:
> >> Hi Tilman,
> >>
> >> Thanks for the suggestion! I have tried with version 2.0.23. I think
> >> the behavior is the same for different PDFBox versions.
> >> For sharing the file, would this Google Drive link work?
> >>
> >> https://drive.google.com/file/d/1Yizkg97z-xyHk9zQj9y9iqhCXr6PfN2S/view?usp=sharing
> >>
> >>
> >> I think some changes in JDK 11 relative to JDK 8 affect the parts that
> >> PDFBox uses to render images from PDFs.
> >> It would be great if you could point out anything relevant that helps us
> >> understand the cause.
> >>
> >>
> >> On Wed, Apr 21, 2021 at 7:40 PM Tilman Hausherr <TH...@t-online.de>
> >> wrote:
> >>
> >>> Please upload the files to a file-sharing service. Also make sure you're
> >>> using 2.0.23.
> >>>
> >>> Tilman
> >>>
> >>> On 21.04.2021 at 23:44, Ethan Huang wrote:
> >>>> Hello community,
> >>>>
> >>>> When testing with JDK 11, we found that it produces larger files than
> >>>> JDK 8 when rendering PDF pages to images. I know PDFBox uses the
> >>>> java.awt library for rendering, but we would like to learn why it
> >>>> produces such a difference and whether it is configurable.
> >>>>
> >>>> I have attached a test doc, but I believe this is common to all
> >>>> docs.
> >>>>
> >>>> JDK 8
> >>>> The size of the image produced from the first page: 74137 bytes
> >>>> The size of the image produced from the second page: 51874 bytes
> >>>>
> >>>> JDK 11
> >>>> The size of the image produced from the first page: 102464 bytes
> >>>> The size of the image produced from the second page: 69454 bytes
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> >>>> For additional commands, e-mail: users-help@pdfbox.apache.org

Re: Why does JDK 11 produce larger file size when rendering?

Posted by Tilman Hausherr <TH...@t-online.de>.
That was over a month ago, and a new JDK version has been released since then.

I see a size difference, but that is between an older Oracle JDK 8 and
the latest Amazon JDK 11. With Amazon Corretto (latest) it's the same.

I did not test your code. I would first have to review what it does, and
it is too long for that.

I retried rendering your file. There is a size difference between
Amazon Corretto 8 and 11 when saving as PNG.

Tilman
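
Since the result seems to depend on the vendor and build and not only on the
major version, it may help to record the full runtime identity with every
test image, and to check whether the PNG writer accepts an explicit
compression setting on that runtime. The following is only a sketch under
those assumptions; PngEncodeCheck, describeRuntime, and encodePng are
hypothetical names, not PDFBox or JDK API.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper: encodes one image in memory and reports the runtime.
final class PngEncodeCheck {

    /** Returns vendor, version, and VM name of the running Java runtime. */
    static String describeRuntime() {
        return System.getProperty("java.vendor") + " "
                + System.getProperty("java.version") + " ("
                + System.getProperty("java.vm.name") + ")";
    }

    /**
     * Encodes the image as PNG in memory. The quality value is applied only
     * if the writer on this runtime reports that it supports compression.
     */
    static byte[] encodePng(BufferedImage img, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("png");
        if (!writers.hasNext()) {
            throw new IOException("no PNG writer available");
        }
        ImageWriter writer = writers.next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            if (param.canWriteCompressed()) {
                param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                param.setCompressionQuality(quality);
            }
            writer.setOutput(ios);
            writer.write(null, new IIOImage(img, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Synthetic gradient so the example needs no fonts or input files.
        BufferedImage img = new BufferedImage(256, 256, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 256; y++) {
            for (int x = 0; x < 256; x++) {
                img.setRGB(x, y, (x << 16) | (y << 8) | ((x + y) & 0xFF));
            }
        }
        System.out.println(describeRuntime());
        System.out.println("quality 0.0: " + encodePng(img, 0.0f).length + " bytes");
        System.out.println("quality 1.0: " + encodePng(img, 1.0f).length + " bytes");
    }
}
```

Running the same class on each JDK and comparing the printed sizes for an
identical input image would indicate whether the encoder alone accounts for
a size difference.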
