Posted to users@pdfbox.apache.org by Joel Hirsh <jo...@gmail.com> on 2015/08/28 06:00:45 UTC

Re: Saving images of PDF pages in version 2

Don't know why, but Google is not giving me messages back...

I have a couple of such PDFs from different sources, but they have other
people's personal information in them, and when I try to redact that, the
problem goes away.  Is there a place to send a page that is not going to
become completely public?

Thanks

> Could you please upload such a PDF somewhere?
>
> Tilman
>
> On 26.08.2015 at 18:42, Joel Hirsh wrote:
> > I am trying to use PDFBox 2 to save images of PDF pages.  If I have a
> > scanned document or a PDF that was created with images, everything works
> > fine.
> >
> > However, if I have a scanned document that had OCR done to it, then I get
> > blank images. Even if I delete the OCR text that overlays the image (using
> > NitroPDF), still nothing.  If I have Acrobat print the file to an image,
> > then, as expected, it's OK again.
> >
> > To create the image I am looping through the pages with
> >
> >              PDPageTree pages = document.getDocumentCatalog().getPages();
> >              Iterator<PDPage> iter = pages.iterator();
> >
> > and then using
> >
> >              BufferedImage pageimage = new PDFRenderer(document).renderImageWithDPI(i, 300.0f);
> >
> >
> > Am I doing something wrong or is there something else I should be doing?
> > Or is this a bug?
> >
> > Thanks
> >
>
>
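
For anyone trying to reproduce this on many files, a blank render like the
one described in the quoted question can be detected automatically. The
following is a minimal sketch using only the JDK. The class name
BlankPageCheck and the method isUniform are made up for illustration and are
not part of PDFBox. It assumes the BufferedImage comes from a renderer such
as the renderImageWithDPI call quoted above.

```java
import java.awt.image.BufferedImage;

public class BlankPageCheck {

    /**
     * Returns true if every pixel has the same RGB value as the
     * top-left pixel, i.e. the image is one flat colour.
     */
    public static boolean isUniform(BufferedImage image) {
        int first = image.getRGB(0, 0);
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (image.getRGB(x, y) != first) {
                    return false; // found a pixel that differs
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page: a freshly created image is one flat colour.
        BufferedImage page = new BufferedImage(50, 50, BufferedImage.TYPE_INT_RGB);
        System.out.println("uniform: " + isUniform(page));

        // Change a single pixel so the image is no longer uniform.
        page.setRGB(10, 10, 0xFFFFFF);
        System.out.println("uniform: " + isUniform(page));
    }
}
```

A page that really is a single flat colour would also be reported as
uniform, so this is only a rough filter for picking out suspicious output.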

Re: Saving images of PDF pages in version 2

Posted by Tilman Hausherr <TH...@t-online.de>.

On 28.08.2015 at 07:25, Tilman Hausherr wrote:
> On 28.08.2015 at 06:00, Joel Hirsh wrote:
>> Don't know why, but Google is not giving me messages back...
>>
>> I have a couple of such PDFs from different sources, but they have other
>> people's personal information in them, and when I try to redact that, the
>> problem goes away.  Is there a place to send a page that is not going to
>> become completely public?
>
> Yes, it's called e-mail: tilman at snafu dot de. It would then go only
> to me and a few dozen intelligence services that read everyone's
> mail.
>
> But seriously, I'd advise against sharing files with personal
> information in them unless you have the permission of the people
> involved, because it could get you into trouble with the law.
>
> Are you sure you are using the latest version? We have had at least
> two people recently who used 2.0 versions that were not up to date. I
> also ask because we recently solved a problem just like the one described.
>
> Tilman

Another thing to check would be whether you get any log messages, e.g.
about missing image libs.
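
If the log does mention a missing reader, one way to see what the JVM has
registered is to ask ImageIO directly. This is only a sketch using the
standard javax.imageio API. The class name ImageReaderCheck is hypothetical,
and the format names "JPEG2000" and "JBIG2" are assumptions about how the
optional plugins register themselves, so check the plugin documentation if
one of them unexpectedly reports false.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

public class ImageReaderCheck {

    /** Returns true if at least one ImageIO reader claims the given format name. */
    public static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Format names are assumptions; plugins may register under other names.
        String[] formats = {"JPEG", "JPEG2000", "JBIG2", "TIFF"};
        for (String format : formats) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
    }
}
```

Run it with the same classpath as the application, otherwise the result
says nothing about the plugins the application actually sees.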

Also possibly helpful would be a screenshot of PDFDebugger (in the menu, 
choose "View", "Show pages", then expand as shown below). Attaching 
doesn't work in the list, but embedding does:



Tilman



Re: Saving images of PDF pages in version 2

Posted by Tilman Hausherr <TH...@t-online.de>.

On 28.08.2015 at 06:00, Joel Hirsh wrote:
> Don't know why, but Google is not giving me messages back...
>
> I have a couple of such PDFs from different sources, but they have other
> people's personal information in them, and when I try to redact that, the
> problem goes away.  Is there a place to send a page that is not going to
> become completely public?

Yes, it's called e-mail: tilman at snafu dot de. It would then go only
to me and a few dozen intelligence services that read everyone's mail.

But seriously, I'd advise against sharing files with personal
information in them unless you have the permission of the people
involved, because it could get you into trouble with the law.

Are you sure you are using the latest version? We have had at least
two people recently who used 2.0 versions that were not up to date. I
also ask because we recently solved a problem just like the one described.
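
One rough way to see which PDFBox build is actually on the classpath is to
read the Implementation-Version entry from the jar manifest. The sketch
below uses reflection so that it compiles without PDFBox present. The class
name PdfBoxVersionCheck is hypothetical, and it is an assumption that the
jar in use carries an Implementation-Version entry; if it does not, the
method returns null.

```java
public class PdfBoxVersionCheck {

    /**
     * Returns the manifest Implementation-Version of the package that
     * contains the named class, or null if the class is not on the
     * classpath or the manifest has no such entry.
     */
    public static String implementationVersion(String className) {
        try {
            // Load without initializing the class; only package metadata is needed.
            Class<?> clazz = Class.forName(
                    className, false, PdfBoxVersionCheck.class.getClassLoader());
            Package pkg = clazz.getPackage();
            return pkg == null ? null : pkg.getImplementationVersion();
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String version = implementationVersion("org.apache.pdfbox.pdmodel.PDDocument");
        System.out.println("PDFBox on classpath: "
                + (version == null ? "not found or version unknown" : version));
    }
}
```

This only reports what the running JVM loaded, which is useful when an
older jar is shadowing a newer one somewhere on the classpath.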

Tilman






---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org