Posted to users@pdfbox.apache.org by Armando Singer <ar...@gmail.com> on 2011/10/25 03:13:51 UTC

pdf to image text quality issue in java 7?

I've done a bit of testing of pdfbox under the new Java 7, update 1 release, and am noticing severe image quality issues when converting a pdf to an image.

Attached is the same pdf converted to an image under Java 6, then again under Java 7 with the same code. The Java 7 version looks pretty bad. Has anyone else noticed an issue like this?

This is with jdk 1.7 update 1 (for solaris x64, running headless). I've also tested against the latest code in svn (the images below are from the most current version).

The good image below is from a recent version of the jdk 1.6 (and it has always looked good on at least jdk1.5+).

To test, I used code like this:

  public static BufferedImage toBufferedImage(final byte[] pdfBytes, final int resolution) throws IOException {
    PDDocument document = null;
    try {
      document = PDDocument.load(new ByteArrayInputStream(pdfBytes));
      final PDPage page = (PDPage) document.getDocumentCatalog().getAllPages().get(0);
      final BufferedImage result = page.convertToImage(BufferedImage.TYPE_INT_ARGB, resolution);
      return result;
    } finally {
      if (document != null) {
        document.close();
      }
    }
  }

And this is exported to png24 with:

ImageIO.write(bufferedImage, "png", outputStream);
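
For anyone who wants to rule out the export step, here is a minimal, JDK-only sketch of it. The class name PngExport and the method toPng are hypothetical names used for illustration; they are not part of PDFBox or of the code above. It assumes the default PNG writer that ships with the JDK, and it checks the boolean that ImageIO.write returns, since that call reports a missing writer by returning false rather than throwing.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, for illustration only.
final class PngExport {

  private PngExport() {
  }

  /** Encodes the given image as PNG and returns the encoded bytes. */
  static byte[] toPng(final BufferedImage image) throws IOException {
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    // ImageIO.write returns false if no writer is registered for the format.
    if (!ImageIO.write(image, "png", out)) {
      throw new IOException("No PNG writer available");
    }
    return out.toByteArray();
  }
}
```

Reading the bytes back with ImageIO.read and comparing pixels against the original BufferedImage should show whether the encoding step changes anything; PNG is lossless, so any degradation would more likely come from the rendering step.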

I can create more test code and a sample pdf, but this should be reproducible with any pdf. The problem seems universal.

Thanks so much,
Armando