Posted to users@pdfbox.apache.org by Andreas Lehmkuehler <an...@lehmi.de> on 2013/08/06 19:58:04 UTC

Re: Having a problem using PDFParser

Hi,

On 06.08.2013 19:07, Alain wrote:
> Hi all,
>
> I have an application that generates a PDF. I download the PDF to the client and I'm using the PDFParser
> class to parse it. Here is my code so far.
>
>
>
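> // Look up every print service that can handle Pageable jobs.
> DocFlavor flavor = DocFlavor.SERVICE_FORMATTED.PAGEABLE;
> PrintRequestAttributeSet patts = new HashPrintRequestAttributeSet();
> PrintService[] ps =
>   PrintServiceLookup.lookupPrintServices(flavor, patts);
> if (ps.length == 0) {
>      throw new IllegalStateException("No Printer found");
> }
>
> PrinterJob job = PrinterJob.getPrinterJob();
> // Hard-coded index: throws ArrayIndexOutOfBoundsException if fewer than 9 services are found.
> job.setPrintService(ps[8]);
>
> // pdfDoc holds the downloaded PDF as a byte[].
> PDFParser parser = new PDFParser(new ByteArrayInputStream(pdfDoc));
> parser.parse();
> // The parsed PDDocument is handed to the print job as its Pageable.
> job.setPageable(parser.getPDDocument());
> job.print();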
>
>
> My problem is that when a scanned image in PDF format is processed, the printed page is blank.
>
> What I found is that when the PDF file is created as PDF version 1.2, the image does appear on the printout.
> If the PDF file is created as version 1.4, the page is blank.
>
> I was able to test this using 2 different scanners; one generates a 1.2 file and the other a 1.4 file.
>
> Can anyone shed some light on where I should be looking?
I guess it depends on the image encoding used (CCITT, DCT, JBIG2, JPX etc.);
some are well supported, others are not. Do you get any console log output
(complaints about unsupported filters)?

> Best Regards
> Alain

BR
Andreas Lehmkühler

Re: Having a problem using PDFParser

Posted by Andreas Lehmkuehler <an...@lehmi.de>.

On 06.08.2013 21:05, Alain wrote:
> Hi Andreas,
>
> No, no apparent complaints, the code is not throwing any exceptions.
>
> I only get the list of available printers, which I am displaying in my code as expected.
OK, so maybe you should try the PDFBox PDFReader as described here [1] and see
if there is any console log output.
You might also use the PDFBox PDFDebugger [1] to have a look at the internals of
your PDFs to find out which encoding is used.
Or, last but not least, upload both PDFs in question to a file-sharing host or
something similar, so that we can have a look ...

BTW: Which version of PDFBox are you using?

> Best Regards
> Alain

BR
Andreas Lehmkühler

[1] http://pdfbox.apache.org/commandline/

Re: Having a problem using PDFParser

Posted by Alain <al...@yahoo.com>.

Hi Andreas,

No, no apparent complaints, the code is not throwing any exceptions. 

I only get the list of available printers, which I am displaying in my code as expected.

Best Regards
Alain





Re: Having a problem using PDFParser

Posted by Alain <al...@yahoo.com>.

Hi Andreas,

Using PDFDebugger (great tool, BTW) I found the following.

Please correct me if I'm looking at the wrong data.

The filter used in the working PDF file is DCTDecode.

For the PDF files that do not work correctly, I found the filters JPXDecode and JBIG2Decode.

I am using a fresh download of 1.8.2.


Thanks

Alain
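
For anyone who wants to repeat this check on many files without opening each one in PDFDebugger, here is a minimal sketch in plain Java (no PDFBox involved) that scans the raw bytes of a PDF for /Filter entries and collects the filter names it sees. The class name ImageFilterScan and the method findFilters are made up for illustration. It is only a text-level heuristic under some assumptions: it reports filters of all streams, not just images, it does not follow indirect references such as "/Filter 12 0 R", and it will miss dictionaries stored inside compressed object streams (PDF 1.5 and later).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of PDFBox.
class ImageFilterScan {

    // Matches "/Filter /Name" as well as "/Filter [ /Name1 /Name2 ]".
    private static final Pattern FILTER_ENTRY =
            Pattern.compile("/Filter\\s*(\\[[^\\]]*\\]|/[A-Za-z0-9]+)");

    // Matches a single PDF name such as "/DCTDecode" and captures "DCTDecode".
    private static final Pattern NAME = Pattern.compile("/([A-Za-z0-9]+)");

    // Returns the sorted set of filter names found in the raw PDF bytes.
    static Set<String> findFilters(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary
        // stream data cannot cause decoding errors.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Set<String> filters = new TreeSet<>();
        Matcher entry = FILTER_ENTRY.matcher(text);
        while (entry.find()) {
            Matcher name = NAME.matcher(entry.group(1));
            while (name.find()) {
                filters.add(name.group(1));
            }
        }
        return filters;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ImageFilterScan <file.pdf>");
            return;
        }
        byte[] pdfBytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(findFilters(pdfBytes));
    }
}
```

Running it against one file from each scanner should show whether the names differ in the way described above (DCTDecode versus JPXDecode and JBIG2Decode). PDFDebugger remains the more reliable way to confirm which filter belongs to which image.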


