You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pdfbox.apache.org by "Paulo R C Mello Junior (JIRA)" <ji...@apache.org> on 2013/10/02 21:51:44 UTC

[jira] [Updated] (PDFBOX-1731) Converting pdf to Image

     [ https://issues.apache.org/jira/browse/PDFBOX-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo R C Mello Junior updated PDFBOX-1731:
-------------------------------------------

    Attachment:     (was: 2514690_5.pdf)

> Converting pdf to Image
> -----------------------
>
>                 Key: PDFBOX-1731
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1731
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.2
>         Environment: Windows 8  and Linux 
> JDK 1.7
>            Reporter: Paulo R C Mello Junior
>              Labels: newbie
>
> I'm trying to convert a pdf page to image but an exception occurs:
> 17:28:20,652 ERROR [org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap] (Thread-69) Something went wrong ... the pixelmap doesn't contain any data.
> 17:28:20,654 WARN  [org.apache.pdfbox.util.operator.pagedrawer.Invoke] (Thread-69) getRGBImage returned NULL
> 17:28:20,661 INFO  [org.apache.pdfbox.util.PDFStreamEngine] (Thread-69) unsupported/disabled operation: i
> 17:28:36,809 ERROR [stderr] (Thread-70) Exception in thread "Thread-70" java.lang.OutOfMemoryError: Java heap space
> 17:28:36,811 ERROR [stderr] (Thread-70) 	at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:92)
> 17:28:36,812 ERROR [stderr] (Thread-70) 	at java.awt.image.ComponentSampleModel.createDataBuffer(ComponentSampleModel.java:415)
> 17:28:36,814 ERROR [stderr] (Thread-70) 	at java.awt.image.Raster.createWritableRaster(Raster.java:941)
> 17:28:36,814 ERROR [stderr] (Thread-70) 	at javax.imageio.ImageTypeSpecifier.createBufferedImage(ImageTypeSpecifier.java:1073)
> 17:28:36,815 ERROR [stderr] (Thread-70) 	at javax.imageio.ImageReader.getDestination(ImageReader.java:2896)
> 17:28:36,816 ERROR [stderr] (Thread-70) 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.readInternal(JPEGImageReader.java:1066)
> 17:28:36,817 ERROR [stderr] (Thread-70) 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.read(JPEGImageReader.java:1034)
> 17:28:36,818 ERROR [stderr] (Thread-70) 	at javax.imageio.ImageIO.read(ImageIO.java:1448)
> 17:28:36,818 ERROR [stderr] (Thread-70) 	at javax.imageio.ImageIO.read(ImageIO.java:1352)
> 17:28:36,819 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg.getRGBImage(PDJpeg.java:264)
> 17:28:36,820 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:83)
> 17:28:36,821 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
> 17:28:36,823 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
> 17:28:36,824 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
> 17:28:36,825 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
> 17:28:36,826 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:125)
> 17:28:36,827 ERROR [stderr] (Thread-70) 	at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:769)
> My code:
> public static List<BufferedImage> getPdfPagesAsImages(String pdfPath)
> 			throws IOException {
> 		File f = new File(pdfPath);
> 		PDDocument pdfDocument = null;
> 		pdfDocument = PDDocument.loadNonSeq(f, null);
> 		List<BufferedImage> bImages = new ArrayList<BufferedImage>();
> 		try {
> 			System.out.println(pdfPath);
> 			int resolution = 185;
> 			if (pdfDocument != null) {
> 				@SuppressWarnings("unchecked")
> 				List<PDPage> pages = (List<PDPage>) pdfDocument
> 						.getDocumentCatalog().getAllPages();
> 				for (PDPage p : pages) {
> 					BufferedImage convertedImage = p.convertToImage(
> 							BufferedImage.TYPE_INT_RGB, resolution);
> 					if (isNegativeImage(convertedImage)) {
> 						bImages.add(invertNegativeImage(convertedImage));
> 					} else {
> 						bImages.add(convertedImage);
> 					}
> 				}
> 			}
> 		} catch (FileNotFoundException e) {
> 			e.printStackTrace();
> 			e.getMessage();
> 			e.getCause();
> 		} catch (IOException e) {
> 			e.printStackTrace();
> 			e.getMessage();
> 			e.getCause();
> 		} catch (Exception e) {
> 			e.printStackTrace();
> 			e.getMessage();
> 			e.getCause();
> 		} finally {
> 			pdfDocument.close();
> 		}
> 		return bImages;
> 	}
> I atached my pdf.
> I have e10000 pdf to verify and about 30% throws this kind of exception



--
This message was sent by Atlassian JIRA
(v6.1#6144)