Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2017/03/22 05:57:41 UTC
[jira] [Comment Edited] (PDFBOX-3727) "premature EOF, image will be incomplete"

    [ https://issues.apache.org/jira/browse/PDFBOX-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935446#comment-15935446 ] 

Tilman Hausherr edited comment on PDFBOX-3727 at 3/22/17 5:57 AM:
------------------------------------------------------------------

Thank you for the file. It displays properly with Adobe Reader, but not with PDFBox or two other Java products, and Ghostscript crashes.

I'm going to sleep, and this investigation will take some time.

Memo for me: 
- investigate image Xop7 (smallest)
- fix possible bug with indexed colorspace in PDFDebugger
- Rows != Height in the image dictionary; the Rows value in DecodeParms appears to be incorrect
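
To make the last memo item concrete, here is a minimal sketch of the arithmetic, assuming the decoder stops after the number of scanlines given by Rows while the image reader still expects Height scanlines. That assumption is a working hypothesis for this file, not a confirmed diagnosis. The class and method names are hypothetical and not part of PDFBox, and the sample values are made up.

```java
// Hypothetical helper, not part of PDFBox: estimates how much of an image
// would be missing if a decoder emitted only Rows scanlines while the image
// dictionary declares Height scanlines.
public class RowsHeightCheck {

    // Bytes per scanline; each scanline is padded to a byte boundary.
    public static int bytesPerRow(int width, int bitsPerPixel) {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Scanlines the reader would still expect after the decoder stops.
    public static int missingRows(int height, int rows) {
        // Rows <= 0 is treated as "not specified", so nothing is assumed missing.
        if (rows <= 0 || rows >= height) {
            return 0;
        }
        return height - rows;
    }

    // Bytes by which the decoded data would fall short of a full image.
    public static long missingBytes(int width, int height, int rows, int bitsPerPixel) {
        return (long) missingRows(height, rows) * bytesPerRow(width, bitsPerPixel);
    }

    public static void main(String[] args) {
        // Made-up values for illustration only (1 bit per pixel, as in CCITT fax images).
        int width = 1728;
        int height = 2200;
        int rows = 1100;
        System.out.println("missing rows:  " + missingRows(height, rows));
        System.out.println("missing bytes: " + missingBytes(width, height, rows, 1));
    }
}
```

If the hypothesis holds, a non-zero shortfall here would be consistent with the "premature EOF" warning and with the lower part of the extracted image staying grey.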





> "premature EOF, image will be incomplete"
> -----------------------------------------
>
>                 Key: PDFBOX-3727
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3727
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.4, 2.0.5
>         Environment: Windows 10/X64
>            Reporter: Ravi
>              Labels: CCITTFaxDecode, ccitt
>
> I am trying to extract all the embedded images from a PDF file, but sometimes the extraction logs the warning below.
> {code}
> [main] WARN  o.a.p.p.g.image.SampledImageReader - premature EOF, image will be incomplete
> {code}
> The extracted images are incomplete (half greyed out).
> I would like to know whether a solution is available for this. My code snippet is below.
> Any help is greatly appreciated.
> {code}
> 	// Extracts every image XObject from each page and writes it out as a PNG file.
> 	public static void testPDFBoxExtractImages() throws Exception {
> 	    // try-with-resources closes the document even if extraction fails
> 	    try (PDDocument document = PDDocument.load(new File(fileName))) {
> 	        PDPageTree list = document.getPages();
> 	        for (PDPage page : list) {
> 	            PDResources pdResources = page.getResources();
> 	            if (pdResources == null) {
> 	                continue; // no resources on this page, so no images either
> 	            }
> 	            System.out.println(page.getRotation());
> 	            for (COSName c : pdResources.getXObjectNames()) {
> 	                PDXObject o = pdResources.getXObject(c);
> 	                if (o instanceof org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject) {
> 	                    File file = new File("C:/temp/" + System.nanoTime() + ".png");
> 	                    ImageIO.write(((org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject) o).getImage(), "png", file);
> 	                }
> 	            }
> 	        }
> 	    }
> 	}
> {code}


